We'll start this class with a motivating question:

Example 1
Imagine an infinite honeycomb lattice in which we do the following walk: we start at a vertex, walk along an edge,
and then at each vertex we visit, we toss a coin to decide whether to turn left or right. (Paths can be retraced
during the walk.) If the mesh of the lattice is small, then the traced-out path looks more like a continuous curve.
How would this random curve look in the limit of small mesh?


It makes sense for the walk to cover the whole plane eventually, and it’s plausible that this curve will converge to 
two-dimensional Brownian motion. After all, if we have a random walk on a square lattice, where all four edges 
are allowed at each step of the walk, all increments of the walk are iid random variables. Since the sum of iid random
variables is approximately Gaussian, this process then has to be Brownian motion (we can focus separately on the 
horizontal and vertical parts of the walk). But some of this breaks down when we have a hexagonal lattice as in our 
problem above — we now have “near-independence” of different steps instead of complete independence, and our walks 
tend to move in the direction that they're currently going. 

To make a bit more progress on this, suppose we take a northward step at some point, and then we wait until 
we get to the next northward step and see what’s happened in the meantime. We have a Markov chain on the six 
possible directions that the walk can be traveling (which is basically a random walk on a cycle), and we can look at the 
increments between north steps. But the increments between north steps are iid, so now we can apply the Central
Limit Theorem and eventually get to the answer that yes, this converges to Brownian motion. 
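To get a feel for the walk in Example 1, here is a minimal simulation sketch. It assumes unit-length edges and encodes the direction of travel as a heading $h \in \{0, \dots, 5\}$ standing for the angle $30^\circ + 60^\circ h$, so a left or right turn changes $h$ by $\pm 1$; the function name `honeycomb_walk` and this encoding are our own illustrative choices, not part of the lecture.

```python
import numpy as np

def honeycomb_walk(n_steps, rng):
    """Headings and positions of a honeycomb walk with fair left/right turns."""
    # Heading h in {0,...,5} stands for the angle 30 + 60*h degrees; a left
    # or right turn at a degree-3 vertex changes the heading by +-60 degrees.
    turns = rng.choice([-1, 1], size=n_steps - 1)
    headings = np.concatenate([[0], np.cumsum(turns)]) % 6
    angles = np.deg2rad(30 + 60 * headings)
    steps = np.column_stack([np.cos(angles), np.sin(angles)])
    positions = np.vstack([[0.0, 0.0], np.cumsum(steps, axis=0)])
    return headings, positions

rng = np.random.default_rng(0)
headings, positions = honeycomb_walk(10_000, rng)
# Rescale by sqrt(n) to compare with Brownian motion at a fixed time.
print(positions[-1] / np.sqrt(len(headings)))
```

The heading sequence is exactly the random walk on a 6-cycle mentioned above, and plotting `positions` for larger and larger `n_steps` (rescaled by $\sqrt{n}$) gives pictures that look increasingly like planar Brownian motion.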

This is an example of universality in action: even though the hexagonal lattice is differently structured from the 
square lattice, we still get Brownian motion, and this Brownian motion is a universal object that arises as the limit
of many different models.


Example 2 
Let's now change our setting a bit: suppose that if we have already hit a given hexagon and turned in a certain
direction, then in the future we will make the same type of turn if we hit that hexagon again. Equivalently, when we
hit a hexagon head-on, color it either blue or yellow (for left or right), and then whenever we hit an already-colored
hexagon, continue along the designated direction again.

If we additionally condition the path to reflect off of itself whenever necessary to avoid making a loop (we can
think of this as a double cover of the plane, where instead of hitting the path we switch to the opposite-colored
arrangement of the plane), and consider a fine mesh for this type of walk, we get a random walk which hits itself but
never crosses. The limiting curve here is what is known as radial SLE(6), and we'll learn about it more later in the
course.


Problem 3 
If we run both the Brownian motion walk and the radial SLE(6) walk until both of them hit a fixed outer box,
which will have a larger “outer boundary?” (It makes sense that the SLE curve always keeps expanding, but it also
seems like that curve will hit the boundary faster.)

It turns out that the answer is “they are the same”: in fact, the probabilistic laws of the outer boundaries are
identical in the two situations. The result turns out to be a form of SLE(8/3), and it has fractal dimension $4/3$
(meaning that the number of $\varepsilon$-balls needed to cover the curve scales as $\varepsilon^{-4/3}$). This was
conjectured by Mandelbrot essentially by inspection, but it took another 20 years after his conjecture for the result
to be proved.


Fact 4 
This equivalence of the outer boundaries is difficult to see locally using the combinatorics of the walk. This is in
particular because the trace of the Brownian motion has dimension 2, but the trace of the radial SLE(6) curve has
dimension $7/4$. So it's remarkable that the outer boundaries agree even though the law of what's going on inside is
completely different! And in fact, we can prove convergence to Brownian motion for any net-zero-drift 3-regular
periodic graph, but the SLE proof only works specifically for the hexagonal lattice. So there are definitely some
miracles going into this.


This gets us to the idea of conformal invariance in complex analysis: recall from the Riemann mapping theorem 
that if we have two simply connected domains, viewed as proper subsets of $\mathbb{C}$, we can always find a conformal
(angle-preserving) map from one to the other. Brownian motion has the nice property that it's invariant under conformal
maps. In particular, if we start a Brownian motion on one domain $D_1$, stop it when it hits the boundary of $D_1$, and
then we apply the conformal map to another domain $D_2$, then (up to a time-change) we get the probabilistic law of
Brownian motion on $D_2$. And it turns out that any two curves with this “symmetry up to a time-change” property are
equivalent in some way, and that’s the fundamental reason why these two curves look the same. 

With that, we've finished our lead-in to the main topic of today's class, which is “local sets of the Gaussian free
field.” Recall that the standard Gaussian in an $n$-dimensional Hilbert space has density
$(2\pi)^{-n/2} e^{-(x,x)/2}$, and a sample from this standard Gaussian can also be written as

\[ X = \sum_{i=1}^n a_i v_i \]

for an orthonormal basis $\{v_i\}$ of the Hilbert space and iid standard (1-dimensional) normal coefficients $a_i$.
(In particular, it doesn't matter what basis we choose, and it doesn't matter what inner product we use, up to the
constant in front of the density.) We'll use this to define the discrete Gaussian free field:


Definition 5 


Let $f$ and $g$ be real-valued functions on the vertices of a planar graph. The Dirichlet form is defined via a sum
over edges of the graph

\[ (f, g)_\nabla = \sum_{x \sim y} (f(x) - f(y))(g(x) - g(y)) \]

(this can be thought of as a dot product of gradients), and the Dirichlet energy is defined as $H(f) = (f, f)_\nabla$.


Notice that in classical physics language, $(f, f)_\nabla$ can be thought of as a potential energy of a system of harmonic
oscillators: the restoration force for a spring is linear in the displacement, and thus the corresponding potential energy
is proportional to the squared displacement. So if we have a mesh of points (like the boxspring mesh for a mattress),
and the vertices are constrained to only move perpendicular to the mesh, then the potential energy depends on the
heights of the points in the mesh!
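As a concrete illustration of Definition 5, the following sketch evaluates the Dirichlet form on a small graph given as an edge list. The helper names `dirichlet_form` and `dirichlet_energy` and the example heights are hypothetical choices of ours.

```python
def dirichlet_form(f, g, edges):
    """(f, g)_grad = sum over edges {x, y} of (f(x) - f(y)) * (g(x) - g(y))."""
    return sum((f[x] - f[y]) * (g[x] - g[y]) for x, y in edges)

def dirichlet_energy(f, edges):
    """H(f) = (f, f)_grad, the 'spring energy' of the height function f."""
    return dirichlet_form(f, f, edges)

# Hypothetical example: a path graph 0 - 1 - 2 - 3 with heights f.
edges = [(0, 1), (1, 2), (2, 3)]
f = {0: 0.0, 1: 1.0, 2: 3.0, 3: 3.0}
print(dirichlet_energy(f, edges))  # 1^2 + 2^2 + 0^2 = 5.0
```

Each edge contributes the squared height difference across it, which is the spring picture described above.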


Problem 6 


Consider the set of points where $\sum_{i=1}^N x_i^2 = N$ (so we have a vector on the sphere of radius $\sqrt{N}$).
If we pick a uniformly random point on this sphere, what is the law of $x_1$?

By symmetry, $x_1^2$ has expectation 1 and $x_1$ has expectation 0, so we know we have a distribution with mean 0
and variance 1. But to get to the actual answer, we can notice that if we choose $(x_1, \dots, x_N)$ from the standard
Gaussian that we've described above, we'll have $\sum_i x_i^2$ approximately equal to $N$ (with error of order
$\sqrt{N}$), and that error is negligible for large enough $N$, so $x_1$ is approximately a standard normal. So looking
at the form of the Gaussian density, if we think of the squared value $x^2$ as being an energy, then we're saying that
the probability of finding the energy to be some $E$ at a given point is proportional to $e^{-E/2}$. This is actually a
more general phenomenon, and it's related to the Boltzmann distribution in statistical physics: what we'll find in this
course is that this is why $e^{-\text{energy}}$ is often the probability measure that we'll see.
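The claim in Problem 6 can be checked numerically. The sketch below samples uniform points on the sphere of radius $\sqrt{N}$ by normalizing Gaussian vectors (which works by rotation invariance) and looks at the first coordinate; the sample sizes are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, samples = 500, 4000

# Normalizing a standard Gaussian vector gives a uniform point on the sphere
# (by rotation invariance); rescale so the radius is sqrt(N).
g = rng.standard_normal((samples, N))
x = g * np.sqrt(N) / np.linalg.norm(g, axis=1, keepdims=True)

first = x[:, 0]
print(first.mean(), first.var())      # should be close to 0 and 1
print(np.mean(np.abs(first) < 1.0))   # close to P(|Z| < 1), roughly 0.68
```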


Definition 7 


The discrete Gaussian free field (DGFF) is a random function $f$ on the vertices of a planar graph, that is, a random
element of $\mathbb{R}^V$ (potentially with some boundary conditions), with density proportional to $e^{-H(f)}$.


We can work out a few properties of this DGFF now by relating this to a discussion of harmonic functions. We
can define the discrete Laplacian on our graph to be

\[ \Delta f(x) = \operatorname{mean}_{y \sim x} f(y) - f(x), \]

and we say that a function $f$ is discrete harmonic if $\Delta f = 0$. We can notice that we'll get a Gaussian random
variable for the value of $f$ at a given point if we condition on fixed neighbors, and no matter what boundary
conditions we have, the expectation of the field will be harmonic. Furthermore, what we have here is a Markov random
field (meaning that once we condition on the values on the boundary of a subset, the field inside is independent of
the field outside).
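A minimal sketch of the discrete Laplacian just defined, assuming the graph is stored as a dictionary of neighbor lists (the name `discrete_laplacian` and the example functions are ours):

```python
def discrete_laplacian(f, neighbors):
    """Return x -> mean_{y ~ x} f(y) - f(x) at every vertex listed in `neighbors`."""
    return {
        x: sum(f[y] for y in nbrs) / len(nbrs) - f[x]
        for x, nbrs in neighbors.items()
    }

# Hypothetical example: interior vertices 1, 2, 3 of the path 0 - 1 - 2 - 3 - 4,
# with vertices 0 and 4 playing the role of the boundary.
neighbors = {1: [0, 2], 2: [1, 3], 3: [2, 4]}
linear = {x: 2.0 * x + 1.0 for x in range(5)}   # harmonic: linear interpolation
bumpy = {0: 0.0, 1: 0.0, 2: 1.0, 3: 0.0, 4: 0.0}

print(discrete_laplacian(linear, neighbors))  # {1: 0.0, 2: 0.0, 3: 0.0}
print(discrete_laplacian(bumpy, neighbors))   # {1: 0.5, 2: -1.0, 3: 0.5}
```

On a path, the discrete harmonic functions are exactly the linear interpolations of the boundary values, which is what the first example shows.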

But we can now move to the continuum Gaussian free field by basically doing this process on an infinite-dimensional
Hilbert space instead, taking the number of vertices in our graph to infinity. It turns out that if we make our mesh
smaller and smaller, the function oscillates too quickly to be well-defined at particular points, but the average value on
a positive Lebesgue measure set is well-defined. What we have to do is make our definition

\[ (f, g)_\nabla = \int \nabla f \cdot \nabla g, \]

meaning that the Dirichlet energy becomes $(f, f)_\nabla$ (or in other words the squared $L^2$ norm
$\|\nabla f\|_2^2$). If we then take $\{f_i\}$ to be an orthonormal basis¹, and we consider $\sum_i a_i f_i$ for iid
standard Gaussian $a_i$, we can declare that our sum is the Gaussian free field that we want. (It turns out that if we
make this definition of $h = \sum_i a_i f_i$ in one dimension and we take our $f_i$s to be sines and cosines, we end up
with a Brownian bridge. So it's nice that this construction works in “higher dimensions” as well!)
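The one-dimensional statement can be tested directly. The sketch below assumes the domain $[0, 1]$ with zero boundary values and uses the sine basis $f_k(t) = \sqrt{2}\sin(k\pi t)/(k\pi)$, which has unit Dirichlet norm; the function names and the truncation level are illustrative assumptions of ours.

```python
import numpy as np

def sample_bridge(t, n_modes, rng):
    """Sample h = sum_k a_k f_k with f_k(t) = sqrt(2) sin(k pi t) / (k pi)."""
    k = np.arange(1, n_modes + 1)
    # Each f_k has unit Dirichlet norm: the integral of (f_k')^2 over [0, 1] is 1.
    basis = np.sqrt(2) * np.sin(np.pi * np.outer(k, t)) / (np.pi * k[:, None])
    a = rng.standard_normal(n_modes)
    return a @ basis

def truncated_variance(t, n_modes):
    """Var h(t) for the truncated sum; it tends to t (1 - t) as n_modes grows."""
    k = np.arange(1, n_modes + 1)
    return np.sum(2 * np.sin(np.pi * k * t) ** 2 / (np.pi * k) ** 2)

t = np.linspace(0.0, 1.0, 201)
h = sample_bridge(t, n_modes=2000, rng=np.random.default_rng(0))
print(h[0], h[-1])                    # both numerically zero: pinned endpoints
print(truncated_variance(0.5, 2000))  # close to 0.25
```

The variance $t(1-t)$ is the variance of a Brownian bridge at time $t$, which is consistent with the remark above.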


Fact 8 
A sample drawn from the Gaussian free field is not defined at a particular point, but it is defined once we average
over a large enough subset and treat $h$ as a distribution. More generally, notice that for any test function $g$, the
Dirichlet form $(h, g)_\nabla$ will be a Gaussian with law $N(0, (g, g)_\nabla)$ (by analogy, if $x_1, x_2, x_3$ are
independent standard Gaussians, then $3x_1 + 2x_2 + 7x_3$ has variance $\|(3, 2, 7)\|_2^2$), and the covariance
$\operatorname{Cov}((h, f)_\nabla, (h, g)_\nabla)$ will just be $(f, g)_\nabla$.
Since everything has zero mean if we have zero boundary conditions, that's enough to define what $h$ is, because
a centered Gaussian process is determined by its covariance structure.


If we take the Laplacian of our Gaussian free field now, we can notice that (by integration by parts, using zero
boundary conditions, and writing out the various components)

\[ (f, g)_\nabla = (-\Delta f, g). \]

In particular, this means that if $g$ is the Gaussian free field, and we want to pair it with $f$ in the Dirichlet
form, we can equivalently integrate it against the test function $-\Delta f$. Returning to the motivating questions at
the beginning of class, now generate a hexagonal color model by taking a Gaussian free field and coloring in the
hexagons based on whether the free field is negative or positive on each hexagon. We can now follow the interface and
look at the path in the limit of small mesh. It turns out that in this case, we also get a natural object, and it's
instead the SLE(4) curve! Basically, SLE curves are the most natural family of curves that don't cross themselves, and
the number parameter is basically determining the Hausdorff dimension ($3/2$ in this case) and how windy the path turns
out to be. What we have here is known as a local set, and we'll see more of this in future lectures. (To be prepared,
we should make sure to do our assigned reading before the next lecture!)
Example 9

The class started with a demonstration of the “procedural maze generator,” which we can find at
https://www.youtube.com/watch?v=xqqGXfZpsmU.


We'll discuss the Gaussian free field some more today, but we'll start by discussing Wilson’s algorithm for uniform 
spanning trees (invented by David Wilson in 1996 at MIT). 


Problem 10 


A spanning tree of a graph is a subgraph that contains every vertex and is a tree (so it has $|V| - 1$ edges). How
can we generate a uniformly random spanning tree of a given graph like a rectangular grid?

¹ To be precise, take the set of smooth functions with zero boundary conditions on a domain $D$, and take the Hilbert
space closure of that set under the Dirichlet inner product. This is then a Hilbert space because it's closed.


One thing we can do is to count the total number of spanning trees, enumerate them, and pick a random number 
between 1 and the number of trees. But this is very inefficient (the number of trees is exponential in the number of
vertices for a rectangular grid²), and we want something that works better.

Instead, here's an alternate strategy: start with a vertex and declare it to be our root. Then pick another vertex and
start doing a loop-erased random walk from that vertex, meaning that we erase any cycle that we form. Eventually,
that random walk will hit our root, and the loop-erased path it traced becomes the first branch of our spanning tree.

From there, we pick another location and do the loop-erased random walk again, stopping when we hit the part of
the spanning tree that we've already constructed. This process continues until all vertices are connected to the root,
at which point we have a spanning tree.
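The procedure just described can be sketched in a few lines. This is a minimal illustration under our own conventions (the graph is a dictionary of neighbor lists, the tree is returned as parent pointers toward the root, and `grid_neighbors` is a hypothetical helper for rectangular grids), not the code behind the maze video.

```python
import random

def wilson_spanning_tree(neighbors, root, rng):
    """Sample a spanning tree; returns {vertex: parent} pointing toward the root."""
    in_tree = {root}
    parent = {}
    for start in neighbors:                  # visit start vertices in any fixed order
        if start in in_tree:
            continue
        # Random walk from `start`; overwriting next_step[v] on each revisit
        # erases loops, since only the last exit from v is remembered.
        next_step = {}
        v = start
        while v not in in_tree:
            next_step[v] = rng.choice(neighbors[v])
            v = next_step[v]
        # Retrace the loop-erased path and attach it to the tree.
        v = start
        while v not in in_tree:
            parent[v] = next_step[v]
            in_tree.add(v)
            v = next_step[v]
    return parent

def grid_neighbors(rows, cols):
    """Adjacency lists of a rows x cols rectangular grid (illustrative helper)."""
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            cand = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
            nbrs[(r, c)] = [(a, b) for a, b in cand if 0 <= a < rows and 0 <= b < cols]
    return nbrs

grid = grid_neighbors(4, 5)
tree = wilson_spanning_tree(grid, root=(0, 0), rng=random.Random(0))
print(len(tree))  # |V| - 1 = 19 edges
```

Storing only the most recent exit direction from each vertex is a standard way to implement loop erasure without explicitly deleting cycles from a path.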


Fact 11 


It turns out it doesn't matter where we start our random walks: for example, we can enumerate the vertices and
start from them in numerical order from smallest to largest. That's what's going on in the maze generator video
above.

This algorithm turns out to be provably as fast as possible when we potentially have different edge probabilities
(so that we have a weighted graph). But let's understand why this actually works for generating a uniformly random
spanning tree:


Example 12 


Suppose that we want to generate a uniformly random spanning tree on a graph using Wilson's algorithm. The
idea is to use something called cycle popping: suppose that under every vertex of this graph except the root,
we have an infinite deck of cards.


For each card, we write on it one of the possible directions to a neighbor of its corresponding vertex, uniformly at 
random. Now, look at the top cards on the decks and see if they form any cycles. If they do, then we can grab those 
cards and toss them away, and now the next cards on those decks are on the top. Repeat this process until there are 
no cycles on the tops — this will generate a tree for us, because we have exactly one fewer edge than vertex overall. 

But if we think of the cards as deciding the steps that we take when we do our random walk (choosing our next 
step by grabbing the top card at that vertex), every cycle we popped must actually have been loop-erased during 
Wilson's algorithm. So if we have a $d$-regular graph, a given collection of loops $L$ (in some specific order), and a
specific spanning tree $T$, then the probability $P(L, T)$ of seeing the necessary cards to form those loops and trees is

\[ P(L, T) = \left(\frac{1}{d}\right)^{|L|} \left(\frac{1}{d}\right)^{|T|}, \]

where $|L|$ and $|T|$ denote the total number of edges that make up $L$ and $T$, respectively. Because this expression
factors, we have independence of $L$ and $T$, meaning the probability of seeing various trees doesn't depend on the
loops that were formed by the random walks. Furthermore, because all trees have the same value of $|T| = |V| - 1$,
all trees are indeed equally likely to show up. (And if we had a graph which isn't regular, or we wanted to sample a
weighted tree, we would get different probabilities in the expression for $P(L, T)$. But the idea is the same.)

And we can now connect this back to the Gaussian free field in the following way: suppose we have a maze as
generated above. Define a function by keeping track of, from every point in the maze, the number of times we wind
around that point before getting to the root. This function is random, and if we make a very fine mesh of that random
function, it turns out to be the Gaussian free field on the graph.

² It's at least $\sqrt{2}^n$ by drawing a “honeycomb” on every other column and picking whether the remaining vertices
connect to the left or the right, and it's at most $4^n$ because each vertex only has 4 neighbors.


Example 13 


Kirchhoff's matrix-tree theorem is another way to count the number of spanning trees for a given graph. We'll
state it shortly, but the setup is connected to the following argument.


Direct all edges of the graph arbitrarily, enumerate the vertices and edges, and define an edge-incidence matrix
$M$ with $|E|$ rows and $|V|$ columns, where $M_{ij} = 1$ if edge $i$ points towards vertex $j$, $M_{ij} = -1$ if edge
$i$ points away from $j$, and $M_{ij} = 0$ otherwise. (This matrix $M$ can be thought of as a function from
$\mathbb{R}^{|V|}$ to $\mathbb{R}^{|E|}$ which encodes the discrete gradient of a function on vertices.) Notice that
for any $v \in \mathbb{R}^{|V|}$, we have

\[ v^T M^T M v = \|Mv\|_2^2 = \|\nabla v\|_2^2, \]

which is the Dirichlet energy of the function $v$ that we discussed last class. So the matrix $M^T M$, which encodes the
inner product, is actually the negative Laplacian matrix of the original graph (since applying $M$ sends a 1 at a vertex
to a $\pm 1$ at all adjacent edges, and then applying $M^T$ turns that quantity to $d$ at the original vertex, where $d$
is its degree, and $-1$ at all the neighbors; up to a constant factor, this is the discrete Laplacian we've been talking
about). Furthermore, notice that this is actually related to the formula $(v, v)_\nabla = (v, -\Delta v)$ that we got in
the continuum case last time! To understand the connection to the matrix-tree theorem, we'll write out the result and
proof:


Theorem 14 (Kirchhoff's matrix-tree theorem) 
Let $L$ be the Laplacian matrix for a graph $G$ (so $L$ has $\deg(v)$ entries on its diagonal, $-1$s for adjacencies
off the diagonal, and 0 otherwise). Since the rows of this matrix add to 0, $L$ is singular. If we delete any
row and corresponding column (of the same index), the determinant of the remaining matrix $L'$ (alternatively,
$\frac{1}{n} \lambda_1 \lambda_2 \cdots \lambda_{n-1}$, where $n = |V|$ and the $\lambda_i$ are the nonzero eigenvalues
of $L$) is the number of spanning trees of $G$.ᵃ

ᵃ We can remove just one row and column, because if $\Delta f = 0$, then $f$ must be constant: indeed,
$0 = (f, f)_\nabla = \sum_e (\nabla f(e))^2$ only if all edges have zero difference.


To prove this, we can rely on the Cauchy-Binet formula for an $m \times n$ matrix $A$ and an $n \times m$ matrix $B$,

\[ \det(AB) = \sum_{S \in \binom{[n]}{m}} \det(A_{[m], S}) \det(B_{S, [m]}), \]

which says that we sum over all possibilities that make $A$ and $B$ into square matrices (keeping $m$ of the $n$
columns of $A$ and the matching $m$ rows of $B$). If we now look at the determinant of $L'$, which is $L$ with some row
and column removed (think of that row as the root vertex), we have by Cauchy-Binet that

\[ \det(L') = \sum_S \det(N_S) \det(N_S^T) = \sum_S \det(N_S)^2, \]


where $N$ is the matrix $M$ above but with the column corresponding to the root vertex removed, and $N_S$ keeps only the
rows of $N$ indexed by $S$. Now observe that picking a subset $S$ of the rows of size $|V| - 1$ means we pick $|V| - 1$
of the edges of our graph, and then $N_S$ outputs the discrete gradient corresponding to those edges. But if that subset
contains a cycle, then its $|V| - 1$ edges cannot connect every vertex to the root, so we can feed in a function which
is a nonzero constant on a connected component missing the root (and zero elsewhere), and that will be sent to 0 by the
gradient function (so $N_S$ is singular). In other words, we can prove that $\det(N_S)$ is nonzero if and only if the
edges $S$ form a spanning tree, and in fact it's $\pm 1$ if it's nonzero because we can order our vertices and edges to
get a triangular matrix. So in total, $\det(L')$ gets a contribution of 1 coming from every distinct spanning tree, as
desired.
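The whole argument can be checked numerically on a small graph. The sketch below uses the complete graph $K_4$ (our choice of example, with 16 spanning trees by Cayley's formula) and compares the reduced determinant, the Cauchy-Binet sum over edge subsets, and the eigenvalue expression; `incidence_matrix` is a hypothetical helper.

```python
import itertools
import numpy as np

def incidence_matrix(n_vertices, edges):
    """Edge-incidence matrix M: row i has -1 at the tail and +1 at the head of edge i."""
    M = np.zeros((len(edges), n_vertices))
    for i, (tail, head) in enumerate(edges):
        M[i, tail], M[i, head] = -1.0, 1.0
    return M

# Complete graph K4 with an arbitrary orientation of its 6 edges.
edges = list(itertools.combinations(range(4), 2))
M = incidence_matrix(4, edges)
L = M.T @ M                      # degrees on the diagonal, -1 for adjacencies
N = M[:, 1:]                     # drop the column of the root (vertex 0)
L_reduced = N.T @ N              # L with the root's row and column deleted

n_trees = int(round(np.linalg.det(L_reduced)))
# Cauchy-Binet: sum det(N_S)^2 over all choices S of |V| - 1 edges.
cauchy_binet = sum(
    int(round(np.linalg.det(N[list(S), :]))) ** 2
    for S in itertools.combinations(range(len(edges)), 3)
)
# Eigenvalue form: product of the nonzero eigenvalues divided by |V|.
eigs = np.sort(np.linalg.eigvalsh(L))
print(n_trees, cauchy_binet, int(round(np.prod(eigs[1:]) / 4)))  # 16 16 16
```

Each edge subset contributes either 0 (it contains a cycle) or 1 (it is a spanning tree) to the Cauchy-Binet sum, which is exactly the counting argument in the proof.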


Example 15 


Suppose we want to count the number of spanning trees of a torus using the matrix-tree theorem. Because we
have the alternate expression for $\det(L')$ in the theorem above, we can do this by calculating the eigenvalues of
the Laplacian matrix instead of computing a determinant.


The idea is that the eigenfunctions of that Laplacian are “sine waves” of the form $e^{2\pi i (kx + jy)/n}$ for
$0 \le k, j \le n - 1$ (these are known as the Fourier modes), and they're orthonormal under the ordinary inner product
(after normalizing) and have easily computable eigenvalues. This is actually related to how we can generate an instance
of the Gaussian free field numerically: since the GFF is a linear combination $\sum_i a_i f_i$ where $a_i$ are standard
normal and $\{f_i\}$ is an orthonormal basis, we can use the fact that $(f, g)_\nabla = (f, -\Delta g)$ and plug in our
normalized eigenfunctions $\{f_i\}$ for $f$ and $g$. Then because $\Delta f_i = \lambda_i f_i$, we have

\[ (f_i, f_i)_\nabla = (f_i, -\Delta f_i) = -\lambda_i. \]

In other words, being orthonormal under the Dirichlet and normal inner products is different by a factor of
$\sqrt{-\lambda_i}$, but orthogonality holds in both cases. So this is exactly how we generate the GFF using
Mathematica: for every vertex, we generate a Gaussian random variable, and then we also include the factor
$\frac{1}{\sqrt{-\lambda}}$, where (up to a constant factor depending on how the Laplacian is normalized, and leaving
out the constant mode $k = j = 0$)

\[ -\lambda_{k,j} = \sin^2\left(\frac{\pi k}{n}\right) + \sin^2\left(\frac{\pi j}{n}\right). \]

Then taking a Fourier transform gives us the sum $\sum_i a_i f_i$ that we want.
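The lecture mentions Mathematica; as a rough NumPy analogue (a sketch under our own assumptions, not the lecturer's code), we can generate white noise, rescale each Fourier mode by $1/\sqrt{-\lambda_{k,j}}$ with $-\lambda_{k,j} = \sin^2(\pi k/n) + \sin^2(\pi j/n)$ (the eigenvalue of the mean-based Laplacian on the $n \times n$ torus), drop the constant mode, and transform back. The function name `torus_gff` and the overall normalization are assumptions.

```python
import numpy as np

def torus_gff(n, rng):
    """Sample a discrete GFF on the n x n torus (zero-mean, up to normalization)."""
    k = np.arange(n)
    s = np.sin(np.pi * k / n) ** 2
    neg_lambda = s[:, None] + s[None, :]      # -lambda_{k,j} for the mean-based Laplacian
    scale = np.zeros((n, n))
    scale[neg_lambda > 0] = 1.0 / np.sqrt(neg_lambda[neg_lambda > 0])  # drop constant mode
    noise = rng.standard_normal((n, n))       # one Gaussian per vertex
    field = np.fft.ifft2(np.fft.fft2(noise) * scale)
    return field.real                         # imaginary part is rounding error

h = torus_gff(64, np.random.default_rng(0))
print(h.shape, abs(h.mean()))
```

Because the scaling factors are symmetric under $(k, j) \mapsto (-k, -j)$, the rescaled Fourier coefficients of real noise stay Hermitian-symmetric, so the output is real up to rounding.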


To finish today's class, we'll discuss Green's functions using the following puzzle (which Professor Sheffield was
asked about when talking to David Wilson at Microsoft Research).

Example 16

Suppose we have an integer number line with sites from $-n$ to $n$, and we start at the position 0. What is the
expected number of steps needed to reach one of the endpoints?

The solution is to write a difference equation by letting $f(k)$ be the expected number of steps needed to reach an
endpoint if starting from $k$. Since $f(k) = \frac{1}{2}(f(k+1) + f(k-1)) + 1$, that tells us that $f$ is quadratic with
leading coefficient $-1$ because $\Delta f = -1$, and it turns out that $f(k) = n^2 - k^2$. And this “Laplacian of a
function being equal to $-1$” comes up in Green's functions as well:


Definition 17 


Let $D$ be a domain, and let $G_D(x, y)$ be the expected number of times we hit $y$ if we start a random walk at $x$
and stop when we hit the boundary of $D$.

This expectation $G_D(x, y)$ will be discrete harmonic in $x$ as long as $x \neq y$, since a walk at $x$ needs to
immediately go to one of its neighbors. But we're not harmonic everywhere because we get an extra bonus of 1 at $y$ (the
visit at time 0), and that's where the “Green's function” concept comes in. Furthermore, this function turns out to be
symmetric, because it's the sum of probabilities over paths from $x$ to $y$, and each path has the same probability when
it is traversed in reverse from $y$ to $x$. But we'll learn more about this next time!
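Example 16 and Definition 17 fit together nicely in one dimension. The sketch below (with our own helper name `green_function_interval`) builds the Green's function on the interval as $(I - P)^{-1}$, where $P$ is the transition matrix restricted to interior sites, and checks both the symmetry and the exit-time formula $f(k) = n^2 - k^2$.

```python
import numpy as np

def green_function_interval(n):
    """Green's function of simple random walk on {-n+1, ..., n-1}, killed at +-n."""
    size = 2 * n - 1
    P = np.zeros((size, size))           # transition matrix between interior sites
    for i in range(size - 1):
        P[i, i + 1] = P[i + 1, i] = 0.5
    # Expected visits: G = I + P + P^2 + ... = (I - P)^{-1}.
    return np.linalg.inv(np.eye(size) - P)

n = 5
G = green_function_interval(n)
sites = np.arange(-n + 1, n)
exit_times = G.sum(axis=1)               # total expected visits = expected steps
print(np.allclose(G, G.T))               # True: G_D(x, y) = G_D(y, x)
print(np.allclose(exit_times, n**2 - sites**2))  # True: matches f(k) = n^2 - k^2
```

Summing $G_D(x, y)$ over $y$ counts the expected total number of visits to interior sites, and each such visit is followed by exactly one step, which is why the row sums give the expected exit time.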
Example 18

We'll start today with some basic facts and observations about random variables.

- An exponential random variable of rate $\lambda$ has density $\lambda e^{-\lambda x}$, and its Laplace transform can
  be computed via

  \[ \mathbb{E}[e^{-tX}] = \int_0^\infty \lambda e^{-\lambda x} e^{-tx} \, dx = \frac{\lambda}{\lambda + t} \]

  (because if we had a factor of $(\lambda + t)$ in the integrand instead of $\lambda$, we'd be integrating the
  probability density of a random variable). Replacing $-t$ with $t$ or $it$ instead gives us the moment generating
  function or characteristic function, but they're all essentially the same up to analytic details as long as our
  functions decay quickly enough. And under certain conditions, the Fourier transform is invertible, so the Laplace
  transform determines the law of the random variable.

- Now suppose we have a normal random variable with density (letting $A = \frac{1}{\sigma^2}$ for convenience)

  \[ f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{x^2}{2\sigma^2}\right) = \sqrt{\frac{A}{2\pi}} \, e^{-Ax^2/2}. \]

  Then we can compute the quantity

  \[ \mathbb{E}\left[e^{-BX^2/2}\right] = \int_{-\infty}^{\infty} \sqrt{\frac{A}{2\pi}} \, e^{-(A+B)x^2/2} \, dx = \sqrt{\frac{A}{A+B}} \]

  using the same logic as above.
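Both transforms are easy to sanity-check by numerical integration. The parameter values below are arbitrary illustrative choices, and the plain midpoint Riemann sum is only a quick check, not a careful quadrature.

```python
import numpy as np

def riemann_integral(values, dx):
    """Plain Riemann sum; accurate enough here because the integrands decay fast."""
    return float(np.sum(values) * dx)

lam, t = 2.0, 1.5          # illustrative rate and transform variable
A, B = 0.7, 1.9            # illustrative A = 1 / sigma^2 and B >= 0

dx = 1e-4
x_pos = np.arange(0.0, 40.0, dx) + dx / 2      # midpoints on [0, 40]
x_all = np.arange(-40.0, 40.0, dx) + dx / 2    # midpoints on [-40, 40]

exp_lt = riemann_integral(lam * np.exp(-(lam + t) * x_pos), dx)
gauss_lt = riemann_integral(np.sqrt(A / (2 * np.pi)) * np.exp(-(A + B) * x_all**2 / 2), dx)

print(exp_lt, lam / (lam + t))            # both approximately 0.5714
print(gauss_lt, np.sqrt(A / (A + B)))     # both approximately 0.5189
```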


Fact 19 
For a different “warmup,” the video https://www.youtube.com/watch?v=QJRnWa_yQq8 is another example of
Wilson's algorithm in action, but it also colors in the maze based on distance from some starting point (taking
the fastest possible route in the maze). And we can also see David Wilson's illustration of the algorithm on
http://dbwilson.com/, in which the entire boundary is counted as a single point.


The focus of today’s class is to connect some of the topics we've introduced in the first two lectures: (1) the 
Gaussian free field, (2) the Laplacian, (3) the Green’s function Gp, (4) uniform spanning trees, and (5) Wilson’s 
algorithm, along with its loops and trees. We discussed something between (1) and (4) last time, but we'll switch 
gears today and see what linear algebra and other techniques can do for us. 

Recall that $G_D(x, y)$ is defined as the expected number of visits to $y$ when a random walk starts at $x$ and stops
when it hits the boundary of $D$. We found last time that $G_D(x, y) = G_D(y, x)$ if we're on the $d$-dimensional
lattice, because we can rewrite

\[ G_D(x, y) = \sum_{k=0}^{\infty} P(\text{walk from } x \text{ is at } y \text{ after exactly } k \text{ steps}) = \sum_{k=0}^{\infty} \frac{1}{(2d)^k} (\text{number of paths from } x \text{ to } y \text{ in } k \text{ steps}) \]

by linearity of expectation (counting only paths that stay inside $D$), and then we can notice that each path is
reversible. We also started to discuss how $G_D$ and $\Delta$ are in fact inverses (up to sign): if we think of
$G_D(x, y) = f_y(x)$ as a function of $x$ alone, then it's discrete harmonic except at $y$.³ Therefore,
$\Delta f_y(x) = -1\{x = y\}$ with the definition of the discrete Laplacian that we have, and thus $\Delta G_D = -I$.

³ The expected number of visits is a weighted average over the expected number of visits at $x$'s neighbors, except if
we're at $y$, we get one extra visit at time 0.
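The path-counting formula and the identity $\Delta G_D = -I$ can both be verified on a one-dimensional example (so $2d = 2$). The domain size and the truncation of the infinite series below are illustrative choices of ours.

```python
import numpy as np

n = 4
size = 2 * n - 1                       # interior sites of {-n, ..., n}
A = np.zeros((size, size))             # adjacency matrix of the interior path
for i in range(size - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

# (A^k)[x, y] counts k-step paths from x to y that stay inside D, and each
# such path has probability (1/2)^k, so G_D = sum_k (A / 2)^k.
G = np.zeros((size, size))
term = np.eye(size)
for _ in range(5000):
    G += term
    term = term @ (A / 2)

laplacian = A / 2 - np.eye(size)       # mean over neighbours minus identity
print(np.allclose(laplacian @ G, -np.eye(size)))  # True: Delta G_D = -I
print(np.allclose(G, G.T))                        # True: paths are reversible
```

Here the boundary sites contribute the value 0 to each neighbor average, which is why the Laplacian restricted to interior sites is simply $A/2 - I$.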


And to connect these objects back to the Gaussian free field, recall its density function
$e^{-(f, f)_\nabla / 2} = e^{-(f, -\Delta f)/2}$ can be written in terms of either the Dirichlet energy or the
Laplacian. But then the Green's function is the covariance of the Gaussian free field:


Proposition 20 


Let $\Gamma$ be a GFF on a graph. Then for any two vertices $x, y$, we have

\[ \mathbb{E}[\Gamma(x)\Gamma(y)] = G_D(x, y). \]

The way we can prove this is by using the Markov property: we know that if we fix the value of $\Gamma$ at $x$, what's
left conditioned on $\Gamma(x)$ is the GFF on our new punctured domain $D \setminus x$, and the expectation of
$\Gamma(y)$ is then the harmonic interpolation on $D \setminus x$. So the expectation is harmonic as a function of $x$
away from $y$ and vice versa, just like $G_D(x, y)$ is; in particular, this means $\Gamma(x)$ is the average of its
neighbors plus an independent normal random variable. That means that the covariance also has the extra term of 1 at
$x = y$ because that's exactly the expectation of a squared standard normal random variable; thus, it's exactly the
function $G_D(x, y)$.


We can now take the Markov property we were discussing earlier and discuss it in more detail: 


Definition 21 


Let $B$ be a subset of a domain $D$. Let $\Gamma_B$ be defined by observing the values of $\Gamma$ in $B$ and then
harmonically extending to $D \setminus B$, and let $\Gamma^B = \Gamma - \Gamma_B$.

We can notice that $\Gamma^B$ is now a standard GFF on $D \setminus B$ with Dirichlet boundary conditions (because
we've subtracted off the boundary conditions from $B$), and $\Gamma_B$ and $\Gamma^B$ are independent of each other
because everything is a Gaussian process here. And in fact, we can observe our vertices one at a time and condition on
what we've seen so far to construct the GFF explicitly with this Markov property. What that means is that we can take
our “random walk on a hexagon” from class 1, turning left and right and forming a path of vertices $B$. Then as long as
we only need to look at those to determine $\Gamma_B$, we have an equivalent to the notion of a “stopping time” but in
space.

Now returning to our basic calculations from the beginning of class, we can generalize our density functions to
more dimensions and write

\[ f(x) = \sqrt{\det A} \, (2\pi)^{-d/2} e^{-(Ax, x)/2} \]

for some (usually symmetric) matrix $A$. So this determinant term (coming from $\Delta$ or $G_D$ in the GFF) tells us
the magnitude of our Gaussian free field's density in some $\varepsilon$-ball near 0, which is useful because it helps
us answer questions like “what is the probability of the field on our small mesh being near 0 at a set of vertices,”
and that answer shouldn't depend on our local structure of the graph: it's essentially a normalization factor that
counts the number of allowed configurations.

In particular, the probability that the value of the GFF at a single vertex $x$ in our graph is within an
$\varepsilon$ distance of 0 is proportional to $\frac{\varepsilon}{\sigma}$ (where $\sigma$ is the standard deviation of
some Gaussian). Then given that value, the probability that the next vertex next to it is within an $\varepsilon$
distance of 0 as well will come from the Green's function on the remaining domain $D \setminus x$, and so on. The
overall probability that everything is near zero is then related to the prefactor in the density and is obtained as a
multiplication of a bunch of factors, and that motivates the following result:


Proposition 22 


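The determinant of $G_D$, which is the inverse of the determinant of the (negative) Laplacian, is

\[ \det G_D = \prod_{j=1}^{n} G_{D \setminus \{x_1, \dots, x_{j-1}\}}(x_j, x_j) \]

for any enumeration $x_1, \dots, x_n$ of the vertices of $D$.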
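Here is a small numerical check of Proposition 22, assuming (as in the earlier examples) that the domain is the interior of a path killed at both ends and that removing a vertex turns it into a boundary point. The helper name `green` is ours.

```python
import numpy as np

def green(P):
    """Green's function (I - P)^{-1} of a walk with sub-stochastic kernel P."""
    return np.linalg.inv(np.eye(len(P)) - P)

# Illustrative domain: interior of a path with 6 sites, killed at both ends.
size = 6
P = np.zeros((size, size))
for i in range(size - 1):
    P[i, i + 1] = P[i + 1, i] = 0.5

product = 1.0
remaining = list(range(size))          # vertices not yet removed from D
for _ in range(size):                  # enumerate x_1, ..., x_n in index order
    G_sub = green(P[np.ix_(remaining, remaining)])
    product *= G_sub[0, 0]             # G_{D \ {x_1..x_{j-1}}}(x_j, x_j)
    remaining.pop(0)                   # removing x_j turns it into boundary

print(np.isclose(product, np.linalg.det(green(P))))              # True
print(np.isclose(product, 1 / np.linalg.det(np.eye(size) - P)))  # True
```

Each factor is the variance of the field at $x_j$ given that it vanishes at the previously removed vertices, which matches the "multiplication of a bunch of factors" heuristic above.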


We also have the following Laplace transform which is a generalization of the one-variable result from before:

Proposition 23
Let $\Gamma$ be a Gaussian free field in $D$ of $n$ vertices with Dirichlet boundary conditions. Then for any
nonnegative vector $k \in \mathbb{R}^n$, we have

\[ \mathbb{E}\left[\exp\left(-\frac{1}{2} \sum_{j=1}^{n} k(x_j) \Gamma(x_j)^2\right)\right] = \sqrt{\frac{\det(-\Delta_D)}{\det(-\Delta_D + I_k)}}, \]

where $I_k$ is a diagonal matrix with entries of $k$.

Here, we should think of $\Gamma^2$ as being another random field, and this result helps us determine its law by taking
the dot product of $\Gamma^2$ with an arbitrary function $k$. The proof is essentially the same as what we did in the
one-variable case; we require $k$ to be nonnegative so that our matrices $-\Delta_D$ and $-\Delta_D + I_k$ are
positive-definite.
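Proposition 23 can be checked by Monte Carlo on a tiny domain. The sketch below assumes the GFF has covariance $G_D = (-\Delta_D)^{-1}$ with $-\Delta_D = I - P$ on three interior sites of a path; the mass vector `k` and the sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative domain: 3 interior sites of a path, so -Delta_D = I - P.
P = np.array([[0.0, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.0]])
neg_lap = np.eye(3) - P
k = np.array([0.3, 1.0, 2.5])            # arbitrary nonnegative masses

# Sample the GFF with density proportional to exp(-(f, -Delta f) / 2),
# i.e. a centered Gaussian with covariance G_D = (-Delta_D)^{-1}.
G = np.linalg.inv(neg_lap)
samples = rng.multivariate_normal(np.zeros(3), G, size=400_000)

monte_carlo = np.mean(np.exp(-0.5 * (samples**2) @ k))
exact = np.sqrt(np.linalg.det(neg_lap) / np.linalg.det(neg_lap + np.diag(k)))
print(monte_carlo, exact)   # the two numbers should agree to about two decimals
```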


Remark 24. In special cases, the formulas for $\det(-\Delta_D)$ can be found in papers such as “The asymptotic
determinant of the discrete Laplacian” (Kenyon 2000). And that work is useful for looking at scaling limits, trees and
matchings, properties of SLE, and so on.


We'll now introduce a related object, the massive Gaussian free field, by adding a term in the Dirichlet energy

\[ \sum_e (\nabla\Gamma(e))^2 \;\longmapsto\; \sum_e (\nabla\Gamma(e))^2 + \sum_x k(x) \Gamma(x)^2 \]

for a mass function $k(x)$, often taken to be constant. (This is motivated by having both a “potential energy” and a
“kinetic energy” term, much like in a spring-mass system in physics.)


Remark 25. We can also add a conductance term $c(e)$ to weight the edges as well. In particular, we can imagine
that every interior point in our domain $D$ is connected very weakly to the boundary, which is fixed at height 0, so
that we reduce some fluctuations in our Gaussian free field.⁴ Then the $k(x)\Gamma(x)^2$ term is really just like a
$c(e)(\Gamma(x) - 0)^2$ term, so adding the massive term is really the same generalization as these extra edges.


Remark 26. These ideas can be extended to the continuum case, extending the Laplacian determinant calculation
to a manifold, but that's ongoing work right now (taking a mesh and normalizing appropriately). The trouble here is
basically that the product of the eigenvalues will go to infinity unless accounted for properly.


To finish today's class, we'll give a hint of what's to come next: if we use a continuous time random walk (so
each step is taken after an Exp(1) amount of time) instead of the discrete time random walk, our Green's function
will still be the same (because it's defined as an expectation) but the resulting object can be more natural. In this
version, we can now think about the occupation time of Wilson's algorithm (the total amount of time needed to
produce the spanning tree). If we want to understand that law, it makes sense to say something about the Laplace
transform, and essentially what we can do is add the “weak edges” from interior points to the boundary and create
a new graph. Then we can calculate the probability of never using those weak edges during Wilson's algorithm, and
that has to do with how long we survive in our loop-erased random walk! That gives us an expression of the form
$\prod_{\text{vertices } x} e^{-k(x) \cdot (\text{time spent at } x)}$, and that factor is also the ratio of the
determinants corresponding to the original and the augmented graphs, which is what shows up in Proposition 23. So
through Laplace transform magic, we can show that the occupation measure coming from Wilson's algorithm agrees in law
with the square of the Gaussian free field, even though these things may seem unrelated at first.

⁴ This comes up if we have a “killing” process in which there is a probability of terminating the walk early at each
step.


4 September 21, 2021 


We'll start with a puzzle today: 


Problem 27 


Suppose we have a discrete lattice $D$ with some boundary $\partial D$, and our Gaussian free field $\gamma(x)$ is
allowed to fluctuate within that domain. How can we compute

\[ \mathbb{E}\left[\prod_{x \in D} (\Delta\gamma)_x\right] \]

(the average product of the Laplacian evaluated at all points)?


A relevant fact to know here is Wick’s theorem, which helps us compute products of random variables which are 


jointly Gaussian. This is particularly useful in particle physics and quantum field theory: 


Proposition 28 (Isserlis' / Wick's probability theorem)

Let $(X_1, \dots, X_n)$ be centered (mean zero) jointly Gaussian, so that the distribution is determined by their covariance matrix. Then

$$\mathbb{E}[X_1 X_2 \cdots X_n] = \sum_{p \in P_n^2} \prod_{\{i,j\} \in p} \mathbb{E}[X_i X_j] = \sum_{p \in P_n^2} \prod_{\{i,j\} \in p} \mathrm{Cov}(X_i, X_j),$$

where $P_n^2$ is the set of partitions of the integers $\{1, \dots, n\}$ into pairs (so an example of a partition $p$ would be $\{\{1, 2\}, \{3, 4\}, \dots, \{n-1, n\}\}$ for $n$ even). In particular, this expectation is zero if $n$ is odd.


The proof is basically an integration by parts argument. We can notice that there are $(n-1)(n-3)\cdots(3)(1)$ ways to form pairings if we have $n$ variables $X_i$ (with $n$ even), so in particular this immediately gives us the formula for the moments of a standard normal random variable (by setting all $X_i$ exactly equal to $X$): for example, if $X$ is standard normal, then

$$\mathbb{E}[X^6] = (6-1)(6-3)(6-5)\,\mathbb{E}[X^2]^3 = 15.$$
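The pairing sum in Wick's theorem can be checked by brute force for small $n$. The following is a minimal sketch, not part of the lecture; the helper names `pairings` and `wick_expectation` are made up for illustration.

```python
def pairings(items):
    """Yield every way of splitting the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def wick_expectation(cov):
    """E[X_1 ... X_n] for a centered Gaussian vector with covariance matrix `cov`."""
    total = 0
    for pairing in pairings(list(range(len(cov)))):
        term = 1
        for i, j in pairing:
            term *= cov[i][j]
        total += term
    return total


# Setting every X_i equal to one standard normal X makes all covariances 1,
# so the sum simply counts pairings: 5 * 3 * 1 of them when n = 6.
sixth_moment = wick_expectation([[1] * 6 for _ in range(6)])
# With an odd number of variables there are no pairings at all.
odd_moment = wick_expectation([[1] * 5 for _ in range(5)])
print(sixth_moment, odd_moment)
```

The recursion always pairs the first remaining index with one of the others, which is why the number of pairings is $(n-1)(n-3)\cdots 1$.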


So we can now take a look at the puzzle from the beginning: notice that for any $x$ we have

$$\mathrm{Var}\big((\Delta\gamma)_x\big) = 1$$

because no matter what the neighbors of $x$ evaluate to, $\gamma_x$ will be a normal random variable centered around its neighbors' average. In fact, the quantity $(\Delta\gamma)_x$ is independent of all of the other information $\{\gamma_y : y \ne x\}$ (because the random fluctuation is essentially independent of the harmonicity). Furthermore, if $x$ and $y$ are not adjacent and not equal (by the independence above),

$$\mathrm{Cov}\big((\Delta\gamma)_x, (\Delta\gamma)_y\big) = 0.$$
If they are adjacent, then we can imagine that $x$ and $y$ are the two interior lattice points of a $3 \times 4$ lattice grid. Subtracting off the harmonic extension outside the lattice, we can further assume there are 0s all around the boundary. So all that's left is to compute the covariance in this one case: the variance of $\gamma_x$ is the Green's function evaluated at $(x, x)$, which is the expected number of visits to $x$ (counting the start) before hitting the boundary of 0s. There's a $\frac{1}{16}$ chance of returning each time (step to $y$ with probability $\frac14$, then back to $x$ with probability $\frac14$), so this means that the variance of the value of the Gaussian free field is

$$G_D(x, x) = \mathrm{Var}(\gamma_x) = 1 + \frac{1}{16} + \frac{1}{16^2} + \cdots = \frac{16}{15}.$$

Similarly, we can compute that $G_D(x, y) = \frac{4}{15}$. To confirm that our calculations are correct, we can check that this function is harmonic everywhere (as a function of its second argument) except at $x$, where the Laplacian is 1. And now we can go back to the Laplacian covariance by plugging in the definition $(\Delta\gamma)_x = \gamma_x - \frac14 \sum_{y \sim x} \gamma_y$ on a lattice grid (the overall sign convention does not affect covariances): writing $X = \gamma_x$ and $Y = \gamma_y$, we thus want to compute

$$\mathrm{Cov}\left(X - \tfrac14 Y,\; Y - \tfrac14 X\right),$$

where $\mathrm{Cov}(X, X) = \mathrm{Cov}(Y, Y) = \frac{16}{15}$ and $\mathrm{Cov}(X, Y) = \frac{4}{15}$. This evaluates to

$$\frac{17}{16}\mathrm{Cov}(X, Y) - \frac14 \mathrm{Cov}(X, X) - \frac14 \mathrm{Cov}(Y, Y) = \frac{17}{16} \cdot \frac{4}{15} - \frac12 \cdot \frac{16}{15} = -\frac14.$$


To summarize, the covariance of the Laplacian at two points $x$ and $y$ is $1$ if $x = y$, $-\frac14$ if $x$ is adjacent to $y$, and $0$ otherwise. So by Wick's theorem, this expectation is $\left(-\frac14\right)^{|D|/2}$ times the number of perfect matchings in the grid (since we only want to pair up points along edges, or else we get 0 in the product). And if we're doing this on the square grid, this kind of matching is also appropriately called a domino tiling.
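The covariance computation for the two adjacent interior points can be checked with exact rational arithmetic. This is a small sketch under the conventions used above (zero boundary values, each vertex with four neighbors); the helper names are illustrative only.

```python
from fractions import Fraction as F

# Green's function on the two interior vertices x, y (values computed in the text).
G = [[F(16, 15), F(4, 15)],
     [F(4, 15), F(16, 15)]]

# (Delta gamma)_x = gamma_x - gamma_y / 4 and (Delta gamma)_y = gamma_y - gamma_x / 4,
# since every other neighbor sits on the zero boundary.
L = [[F(1), F(-1, 4)],
     [F(-1, 4), F(1)]]


def matmul(A, B):
    """Plain matrix product for small lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


# The covariance of a linear image L * gamma of a Gaussian vector is L G L^T.
lap_cov = matmul(matmul(L, G), transpose(L))
print(lap_cov)
```

Because $L$ is exactly the inverse of $G$ here, the product collapses to $L$ itself, which is another way to see the values $1$ and $-\frac14$.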


Fact 29 
On a rectangular grid with one fixed “red” vertex removed, the problem of counting domino tilings is somewhat 


subtly related to the problem of counting spanning trees — this is known as the Temperleyan bijection. The 


idea is as follows: take the vertices which are $(2\mathbb{Z})^2$ away from the red vertex and color them blue. Then given


a perfect matching on the original grid, construct a tree on the red and blue vertices (which now form a smaller 
rectangular grid) by extending all edges that are connected to a blue vertex. This process never results in any 
cycles, because (if so) we'd have an odd number of edges inside the cycle and we wouldn't have had a perfect 
matching originally. Since we get one edge per blue vertex, this means we have the right number of edges to form 


a spanning tree on the red and blue vertices. 


Remark 30. Our first problem set is due October 14, and it'll basically go over the Werner and Powell lecture notes 
that we've been following along. We'll need to write down a few short paragraphs about each chapter and solve a few 


exercises. (We shouldn't worry too much about the grade aspect — the problems should be fun and enlightening.) 


One of the problems discusses Brownian loop-soups, which we'll now start to understand in class. The idea is that a “Brownian loop,” a Brownian motion conditioned to return to its origin at time $T$, is a continuous version of a simple random walk conditioned to return to its origin after $n$ steps. In the continuous case, we know that the probability of return at any fixed time $T > 0$ will always be zero, but we can think of the Brownian motion in a slightly different way: motivated by our Gaussian free field, we have a harmonic component on $[0, T]$, which is the linear function connecting $(0, B_0)$ and $(T, B_T)$, plus the remaining part which starts and ends at zero. These two parts are independent of each other, so we can “force the motion to return to zero at time $T$” by sampling the remaining part as

$$\tilde{B}_t = B_t - \frac{t}{T} B_T$$

(this is what we call a Brownian bridge). We can then get a two-dimensional version of this by running two independent Brownian bridges in the x- and y-coordinates, and that gives us a Brownian loop.
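The bridge construction can be sketched on a time grid: sample a discretized Brownian path and subtract the straight line through its endpoint. This is only an illustration with made-up function names, and the discretization (1000 equal steps) is an assumption, not something from the notes.

```python
import random


def brownian_path(n_steps, total_time, rng):
    """Sample B at times k * total_time / n_steps for k = 0..n_steps, with B_0 = 0."""
    dt = total_time / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, dt ** 0.5))
    return path


def bridge_from_path(path):
    """Subtract the linear interpolation of the endpoint: B_t - (t / T) * B_T."""
    n_steps = len(path) - 1
    end = path[-1]
    return [b - (k / n_steps) * end for k, b in enumerate(path)]


rng = random.Random(0)
bridge = bridge_from_path(brownian_path(1000, 1.0, rng))
print(bridge[0], bridge[-1])  # both endpoints are pinned at zero

# A discretized planar Brownian loop: two independent bridges as coordinates.
loop = list(zip(bridge_from_path(brownian_path(1000, 1.0, rng)),
                bridge_from_path(brownian_path(1000, 1.0, rng))))
```

Note that no conditioning is performed: the independence of the linear part and the remainder is what makes the subtraction equivalent to conditioning on the endpoint.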

There turns out to be a natural measure on the set of loops in the plane, which is essentially $\frac{1}{T}\,dx\,d\ell$ (where $dx$ is Lebesgue measure and $d\ell$ dictates the measure of Brownian loops of length $T$). In other words, we assign a special point $x$ and then pick a rooted loop that starts and ends at $x$. (Alternatively, we can have a measure $d\ell$ on unrooted loops, which changes our measure by an additional factor of $T$.)

If we now take a Poisson point process with respect to that Brownian loop measure (meaning that the number of 
events in any set $A$ is Poisson with mean $c\,\mu(A)$, and the numbers of events in disjoint regions are independent), we get
a collection of loops, and that gives us our Brownian loop-soup. (It’s okay that our measure is infinite here — there 
are finitely many loops in any finite region of interest. But sometimes we only look at loops that live in some domain 


D as well.) 


Fact 31 


Searching up “Brownian loop soups” on Google gives us results about “loop-soup clusters” and their boundaries. It 


turns out that “loop-soup cluster boundaries” within some domain D have some interesting properties — multiplying 
the Poisson process intensity c by a constant leads us to conformal loop ensembles, which are random collections 


of loops with conformal symmetries. 


5 September 23, 2021 


Today, we'll talk more about the “fundamental miracle” coming in the connection between loop-soup occupation
measure and the square of the Gaussian free field. To understand it, we'll start by looking at some lower-dimensional 


analogs through random variables: 


• First, let's review the gamma random variable. Recall that a gamma random variable with density $f_k(x) = \frac{x^{k-1} e^{-x}}{(k-1)!}$ is what we get if we add up $k$ iid exponential random variables (the $e^{-x}$ comes from the individual $e^{-x_i}$s, and the rest comes from the volume of the $(k-1)$-dimensional simplex formed by the points where $\sum x_i = x$). (By the central limit theorem, this means that gamma random variables with large $k$ look approximately normal. And we can define this distribution for any real number $k > 0$, replacing $(k-1)!$ with $\Gamma(k)$, and the sum of two independent gamma random variables of the same rate is another gamma random variable as well; this means we have a random variable which is infinitely divisible.)


• Since the exponential random variable is a continuous version of the geometric random variable (which we can imagine as counting the number of coin flips required before the first head comes up), it makes sense that there's something related in the same way with the gamma distribution, and that's known as the negative binomial distribution: it's defined as the number of flips of a coin needed before we get $k$ heads. But then we can extend the definition for $k$ not necessarily an integer in the same way: since the probability mass function looks like

$$f_{k,p}(x) = \binom{x-1}{k-1} p^k (1-p)^{x-k} = \frac{(x-1)!}{(k-1)!\,(x-k)!}\, p^k (1-p)^{x-k}$$

(where $p$ is the probability of a head and $k$ is the number of heads we want to see), and this definition involves factorials just like the gamma density does, we can define a negative binomial random variable with parameter $k$ even when $k$ is an arbitrary real number.
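One way to make the non-integer extension concrete is to replace each factorial by a gamma function. The sketch below assumes that convention (so the support is $x = k, k+1, k+2, \dots$, indexed by the number of tails); the function name is hypothetical.

```python
import math


def neg_binomial_pmf(x, k, p):
    """Mass at x total flips for the k-th head; real k > 0 is allowed
    as long as x - k is a non-negative integer (the number of tails)."""
    failures = x - k
    # (x-1)! / ((k-1)! (x-k)!) written with gamma functions, in log form.
    log_coeff = math.lgamma(x) - math.lgamma(k) - math.lgamma(failures + 1)
    return math.exp(log_coeff + k * math.log(p) + failures * math.log(1 - p))


# Integer k reproduces the usual binomial-coefficient formula.
print(neg_binomial_pmf(5, 3, 0.5))

# For non-integer k the masses still sum to 1 over x = k, k + 1, k + 2, ...
k, p = 2.5, 0.3
total = sum(neg_binomial_pmf(k + m, k, p) for m in range(2000))
print(total)
```

Working with `math.lgamma` rather than `math.gamma` avoids overflow for large arguments.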


• Next, we can discuss the chi-square distribution, which comes up when we sum up the squares of $k$ iid standard normal random variables: it has density

$$f(x) = \frac{x^{k/2-1} e^{-x/2}}{2^{k/2}\,\Gamma(k/2)}.$$

We can notice that this is pretty similar in form to what we had above: in particular, when $k = 2$, we get an exponential random variable, and that can be also understood through the fact that $\frac{1}{2\pi} e^{-(x^2+y^2)/2}$ is the joint density of two iid standard normal random variables and we can integrate $P(X^2 + Y^2 < c)$ in polar coordinates explicitly.
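For reference, the polar-coordinate integral mentioned in the last bullet works out as follows (a short worked computation, with $X, Y$ iid standard normal and $c > 0$):

```latex
P(X^2 + Y^2 < c)
  = \int_0^{2\pi} \int_0^{\sqrt{c}} \frac{1}{2\pi}\, e^{-r^2/2}\, r \, dr \, d\theta
  = \Big[ -e^{-r^2/2} \Big]_{0}^{\sqrt{c}}
  = 1 - e^{-c/2}.
```

This is the distribution function of an exponential random variable with rate $\frac12$, matching the chi-square density $\frac12 e^{-x/2}$ at $k = 2$.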


So we can now look at the square of the Gaussian free field again: notice that if we take two copies of the Gaussian 
free field, square them, and add them together, this is kind of like a chi-squared distribution with k = 2. After all, 
at any given point, the sum of two squared Gaussians of some variance (the Green's function at that point) will be 
some exponential random variable. 

But now if we play the Wilson's algorithm game with the Green's function, asking how many times we expect to 
return to the point x that we started, we can calculate this by thinking about how many loops we form — if we know 
the probability of forming a loop before hitting the boundary of our domain, then we can play the geometric random 
variable game from there to evaluate $G_D(x, x)$. If our Wilson's algorithm runs in continuous time, we essentially add
together exponential random variables instead — that’s the connection with the loop-soup measure, because a Poisson 
point process takes an exponential amount of time between events. 

We can now use the fact that the Poisson point process of loops is infinitely divisible, because of the way that 
Poisson processes are defined. This means that if we can make the connection between the loop-soup and the sum 
of two squared GFFs, we can just reduce the intensity by a factor of two and get something related to just a single 
squared GFF. (And remember that with Laplace transforms, adding multiple copies of an independent random variable 


is very natural, because we can write down things like 


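$$\mathbb{E}\left[e^{-t(X_1 + X_2 + X_3)}\right] = \mathbb{E}\left[e^{-tX_1}\right] \mathbb{E}\left[e^{-tX_2}\right] \mathbb{E}\left[e^{-tX_3}\right].)$$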


Remark 32. Note that these lectures are not meant to replace the lecture notes that are assigned as reading: they 
are meant as supplementary material, and we should still make sure to do the detailed line-by-line checking on our 


own. But for this particular instance, we're going to do a bit of more detailed checking together during lecture. 


One important rigorous detail we should keep in mind is the definition of rooted versus unrooted loops: a rooted loop $(\ell_0, \dots, \ell_{m-1})$ is a nearest-neighbor sequence in our domain $D$ where $\ell_0$ and $\ell_{m-1}$ are adjacent and connected at the end, while an unrooted loop is an equivalence class of all such circular relabelings. We'll let $m = |\ell|$ be the number of steps in the loop $\ell$, and let's define $j_\ell(x)$ to be the number of times that our loop $\ell$ hits the point $x$.


Definition 33

The rooted loop measure assigns a mass $\frac{(2d)^{-|\ell|}}{j_\ell(x)}$ to a loop $\ell$ in $D$ that is rooted at $x$.


(The $(2d)^{-|\ell|}$ factor here comes from the actual probability of seeing the loop, and we need an additional $\frac{1}{j_\ell(x)}$ factor to compensate for the “number of total ways this loop could have been chosen to be rooted at $x$.”⁵) Towards defining the unrooted loop measure now, if we have an unrooted loop (equivalence class) $L$, we let $J(L)$ be the number of “repeats” that the loop makes in a row; this is just for technical reasons.


Definition 34

The unrooted loop measure $\mu_D$ assigns a mass $\frac{(2d)^{-|L|}}{J(L)}$ to each loop $L$ that is inside the domain $D$.


⁵ If we're not so happy about this additional factor, we can also imagine that we have a much finer mesh, and on each step of our loop, we move a distance at most d instead of having to go to a neighbor. Then that factor $j_\ell(x)$ will become overwhelmingly closer to 1 everywhere.
We can check that these two measures are defined so that one is the image measure of the other when we map
rooted loops to unrooted loops. And now that we've defined the measures on loops more rigorously than last lecture, 


we can also define the loop-soups more rigorously: 


Definition 35
Let $D$ be a domain and let $\alpha > 0$. A discrete loop-soup with intensity $\alpha$ is a Poisson point process of intensity $\alpha\mu_D$ (yielding unrooted loops).


Lemma 36
Let $M_{D,x}$ be the set of loops in $D$ that visit $x$. We have the relation

$$\frac{1}{G_D(x, x)} = \exp\left(-\mu_D(M_{D,x})\right).$$

This is relevant for us because a Poisson random variable $X \sim \mathrm{Pois}(\lambda)$ satisfies

$$P(X = k) = \frac{e^{-\lambda}\lambda^k}{k!} \implies P(X = 0) = e^{-\lambda},$$

so we can use this lemma to find the probability that none of our loops in the loop-soup hit $x$.


Proof. Let $U$ be the sum of $(2d)^{-|\ell|}$ over the set of rooted loops $\ell$ starting and ending at $x$ which only hit $x$ once. Then (because the exponential terms combine), $U^n$ is the sum of $(2d)^{-|\ell|}$ over the set of rooted loops $\ell$ that hit $x$ exactly $n$ times during the loop, since we independently pick one of the loops $n$ times in a row. Evaluating, we find that

$$\mu_D(M_{D,x}) = U + \frac{U^2}{2} + \frac{U^3}{3} + \cdots = -\ln(1 - U),$$

where the denominators come from the $j_\ell(x)$ term that we put into the rooted measure initially. Therefore the expression on the right-hand side is just $1 - U$.

But on the other hand, $G_D(x, x)$ is the expected number of visits to $x$ before reaching the boundary during a random walk, so summing over the probabilities of each possibility yields

$$G_D(x, x) = P(\text{number of visits} \ge 1) + P(\text{number of visits} \ge 2) + \cdots = 1 + U + U^2 + \cdots = \frac{1}{1 - U},$$

so the left-hand side is also $1 - U$.
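The two series in this proof can be compared numerically. As a hedged illustration, take the domain with two adjacent interior vertices from the previous lecture, where the only once-returning loop from $x$ is $x \to y \to x$, so $U = \frac14 \cdot \frac14$; the function names below are made up.

```python
import math


def loop_mass_through_x(U, terms=200):
    """Sum of U^n / n over n >= 1: the unrooted-loop mass of loops visiting x."""
    return sum(U ** n / n for n in range(1, terms + 1))


def green_diagonal(U, terms=200):
    """Expected number of visits to x: the geometric series 1 + U + U^2 + ..."""
    return sum(U ** n for n in range(terms))


U = 1 / 16  # step to y (probability 1/4), then step back to x (probability 1/4)
lhs = 1 / green_diagonal(U)
rhs = math.exp(-loop_mass_through_x(U))
print(lhs, rhs, 1 - U)
```

Both sides agree with $1 - U = \frac{15}{16}$, consistent with $G_D(x, x) = \frac{16}{15}$.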


Since the Green's function $G_D(x, x)$ tells us something related to the density of the GFF being near zero at a given point $x$ (which we explained earlier is $\frac{1}{\sqrt{2\pi G_D(x, x)}}$), we can get the following result:

Proposition 37

The probability that the loop-soup (under the above measure) contains no loops is $\frac{1}{\det G_D}$.


Proof. Recall that we can construct our GFF one step at a time conditioned on the previous values, and similarly we can check that a loop-soup contains no loops if and only if it doesn't have loops through $x_1$, then (conditioned on loops staying in $D \setminus \{x_1\}$) it doesn't have any loops through $x_2$, and so on. Then the determinant formula for $G_D$ gives us the result.
In particular, the partition function for the loop soup (when we normalize so that the mass of the empty loop is 1) has something to do with this determinant. And we'll see this when we discuss random graphs and random surfaces with loop-soups: we'll weight by the determinant of the Laplacian (remembering that $\det(-\Delta_D) \cdot \det G_D = 1$) raised to some power, so that when we condition on particular graphs, we get the loop-soup measure. This helps us make a connection from discrete triangulations and surfaces to continuum random surfaces (Liouville quantum gravity is the key term), but the proofs of convergence are surprisingly tricky.

With that, we'll return again to local sets in more detail. Suppose we have a domain $D$ and some subset $B$ of it, so that we can define the fields $\Gamma_B$ (the GFF $\Gamma$ on $B$ and harmonic on $D \setminus B$) and $\Gamma^B = \Gamma - \Gamma_B$. This is the idea of splitting up the GFF into its harmonic part and its fluctuation.

Definition 38

A random set $A$ is local if, for each fixed $B$, the pair $(\Gamma_B, \{A = B\})$ is independent of $\Gamma^B$.

In other words, if we're told the value of $\Gamma_B$ and want to guess whether $A = B$, we don't gain any more information about that if we're told the “fluctuation” values of $\Gamma^B$. There are some weird examples that we should keep in mind: for example, $A$ doesn't necessarily have to be algorithmically local.


Example 39 

Consider a free field on three independent points (so the GFF is just independent random variables). Sample all 
three Gaussians and look at the signs on those variables. If the signs agree, let $A = \emptyset$, and otherwise, let $A$ be a
subset of two of the three points which have different signs. Then if we're told the values of $\Gamma$ on two points, we
can look at their signs — if they're different, then there's a 50-50 chance that A is those two points, and knowing 
the third sign does not change that probability. Thus A is local. However, there’s no way of observing the values 


of the GFF one at a time to get this situation to happen, so this set is not algorithmically local. 


6 September 28, 2021 


We'll begin by reviewing Gaussian free field stories in some simple cases to make sure we understand how everything 
works, and then we'll look at the other extreme with the continuum setting (which will get us to tempered distributions, 


Schwartz space, and so on). 


Example 40 
Recall that we previously considered a very simple domain D on which a discrete Gaussian free field was defined, 


where we have two interior vertices (perhaps at x = (0,0) and y = (1,0)) which are connected by an edge, and 


where we have zero boundary conditions at the “boundary points” $(0, \pm 1)$, $(1, \pm 1)$, $(-1, 0)$, and $(2, 0)$.


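The Gaussian free field is then just dictated by $\gamma_x = A$ and $\gamma_y = B$, the values at the two vertices that we have. We found last time that the Green's function can be represented with the matrix

$$G_D = \begin{pmatrix} \frac{16}{15} & \frac{4}{15} \\ \frac{4}{15} & \frac{16}{15} \end{pmatrix}.$$

Inverting this matrix yields

$$G_D^{-1} = \begin{pmatrix} 1 & -\frac14 \\ -\frac14 & 1 \end{pmatrix},$$

and this matrix is supposed to be (the negative of) the Laplacian $\Delta_D$. Indeed, the Laplacian at $x$ is the average of its neighbors minus the value at $x$, which is $-A + \frac{B}{4}$, and the Laplacian at $y$ is very similarly $-B + \frac{A}{4}$.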

But now, let's try to count the number of spanning trees of our graph. Here, all of the six boundary points are combined together into a single “multi-vertex” $v$, so our graph has three vertices $x, y, v$. We want to use the matrix-tree theorem, which requires us to use a slightly different Laplacian matrix scaling: we instead have (letting the rows correspond to $x, y, v$ in that order)

$$\Delta = \begin{pmatrix} 4 & -1 & -3 \\ -1 & 4 & -3 \\ -3 & -3 & 6 \end{pmatrix}.$$

The matrix-tree theorem then tells us that the number of spanning trees should be $\frac{1}{n}\lambda_1 \cdots \lambda_{n-1}$, or $\frac13 \lambda_1 \lambda_2$ in this case. Indeed, the eigenvalues are 9, 5, 0, and the determinant of the top-left $2 \times 2$ submatrix is also 15, and these facts both tell us that there are 15 spanning trees in this graph. Indeed, if we connect $x$ and $y$, there are 6 ways to connect to $v$ and complete the tree, and otherwise there are $3^2 = 9$ ways to connect each of $x$ and $y$ to the boundary, yielding $6 + 9 = 15$ overall.


Remark 41. Remember that the reason we only have one zero eigenvalue is that if we have a connected domain, the 


functions with zero Laplacian form a one-dimensional subspace (the set of constant functions). 


Example 42 


We can now complicate matters a bit by adding a third vertex z at (2,0) to our domain and try to do the same 


computations as before, computing the Laplacian matrix and the number of spanning trees. 


The matrix-tree cofactor that we want to compute can then be the one involving the three vertices in $D$, and thus the number of spanning trees here will be

$$\det \begin{pmatrix} 4 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 4 \end{pmatrix} = 56.$$

This can be verified by checking the edges $e_1$ between $(0, 0)$ and $(1, 0)$ and $e_2$ between $(1, 0)$ and $(2, 0)$:

• If neither $e_1$ nor $e_2$ exist in our tree, then there are $3 \cdot 2 \cdot 3 = 18$ ways to connect $x, y, z$ to $v$.

• If $e_1$ exists but $e_2$ does not, then there are $5 \cdot 3 = 15$ ways to connect $z$ to $v$ and also either $x$ or $y$ to $v$. The same argument yields 15 ways if $e_2$ exists but $e_1$ does not.

• If $e_1, e_2$ both exist, then there are 8 ways to connect the combined $x, y, z$ to $v$.

Indeed, $18 + 15 + 15 + 8 = 56$.
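Both spanning tree counts (Examples 40 and 42) can be reproduced from the matrix-tree theorem with exact arithmetic. The following is a sketch with illustrative helper names; the graph Laplacians are written with the merged boundary vertex $v$ as the last row and column, and the boundary edge counts (3, 2, 3 for the three-vertex domain) are read off from the lattice picture.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def count_spanning_trees(laplacian):
    """Matrix-tree theorem: delete the last row and column, then take the determinant."""
    minor = [row[:-1] for row in laplacian[:-1]]
    return int(determinant(minor))


# Rows/columns ordered x, y, v (two interior vertices) and x, y, z, v (three);
# off-diagonal entries are minus the number of edges between the two vertices.
two_vertex = [[4, -1, -3],
              [-1, 4, -3],
              [-3, -3, 6]]
three_vertex = [[4, -1, 0, -3],
                [-1, 4, -1, -2],
                [0, -1, 4, -3],
                [-3, -2, -3, 8]]
print(count_spanning_trees(two_vertex), count_spanning_trees(three_vertex))
```

Deleting the row and column of $v$ leaves exactly the interior block, which is why the cofactor only involves the vertices in $D$.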


We can now draw connections back to other properties of the Gaussian free field that we've been talking about, 


like the density function. When our domain has just the two points $x, y$, the probability density for the GFF is

$$\frac{\sqrt{\det(-\Delta_D)}}{(2\pi)^{n/2}} \exp\left(-\frac{\mathcal{E}_D(\gamma)}{2 \cdot 2d}\right) = \frac{\sqrt{15/16}}{2\pi} \exp\left(-\frac{(A - B)^2 + 3A^2 + 3B^2}{8}\right),$$

where $d = 2$ is the dimension of our lattice, $n = 2$ is the number of points we have, and $\mathcal{E}_D(\gamma)$ is our Dirichlet energy. We can then answer questions like the law of $(A^2, B^2)$, or the expected value $\mathbb{E}\left[e^{-aA^2 - bB^2}\right]$ (giving us the corresponding Laplace transform). And because the density is also the exponential of a quadratic function in $A$ and
B, we'll just end up with a ratio of two determinants, just like we did in more generality earlier on in the class. So this
is an example to keep in mind when we're stepping through the notes and thinking about the loop-soup occupation 


measures! 
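As a hedged sketch of the Laplace transform computation in this two-point example (the parameter names $a, b \ge 0$ are introduced here for illustration and are not taken from the notes), the Gaussian integral gives a ratio of determinants:

```latex
\mathbb{E}\left[ e^{-aA^2 - bB^2} \right]
  = \sqrt{\frac{\det M}{\det\left( M + 2\,\mathrm{diag}(a, b) \right)}}
  = \sqrt{\frac{15/16}{(1 + 2a)(1 + 2b) - 1/16}},
\qquad
M = -\Delta_D = \begin{pmatrix} 1 & -\tfrac14 \\ -\tfrac14 & 1 \end{pmatrix}.
```

Here the density of $(A, B)$ is proportional to $\exp(-\frac12 v^T M v)$ with $v = (A, B)$, and multiplying by $e^{-aA^2 - bB^2}$ just replaces $M$ by $M + 2\,\mathrm{diag}(a, b)$ in the exponent. Setting $a = b = 0$ gives 1, as it should.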


Fact 43 


One idea brought up in the notes is that we can “pass to the continuum limit” by replacing each edge in our lattice 


with many small edges, rather than making the entire mesh smaller. (This idea is due to Titus Lupu.) 


This is called a “cable graph,” and it has the property that between every set of endpoints, we have a Brownian 
bridge with endpoints given by the free field values. (And we can check that the Green's function focused on the main 


vertices will be the same as before, but the walks can now make small loops along each edge.) 


Remark 44. This cable graph is sort of an “intermediate case” between the discrete graph and the full continuum 
GFF, and the reason we might want to consider it is that we can look at the (random) collection of points where the 
square of the free field is zero. We then might be curious about this zero set in the limit, looking at the connected 
components that are formed. In particular, we can recover the GFF by independently deciding whether each connected 
component is positive or negative, but that idea is much murkier if we're on the two-dimensional mesh instead of the 


cable graph. 


With this, we're ready to jump into the continuum world, and we'll start with a bit of analysis. 


• The Schwartz space of functions is the set of infinitely differentiable functions $f$ such that $f$ and all of its derivatives decay faster than any inverse polynomial at infinity. The reason this is a nice space is that the Schwartz space is preserved under Fourier transforms, since multiplication by polynomials and differentiation behave nicely on “both sides” (the usual and Fourier spaces).

• The $L^2$ space is a larger space than the Schwartz space, but what's nice is that the Fourier transform is an isometry on $L^2$ (up to $2\pi$ factors), so $L^2$ is also preserved.

• We can now look at an inner product $(\phi, h) = \int \phi(z) h(z)\,dz$. If we try to construct a “dual” to the Schwartz space by requiring that the inner product contains one element of the Schwartz space and one element in its dual, then we get the set of tempered distributions. (An example of a tempered distribution is the “zero-width, infinite-height” delta function $\delta_x$, defined via $(\phi, \delta_x) = \phi(x)$.) Then the set of tempered distributions is also preserved by Fourier transforms, because we can define the functional $\hat{h}$ via the equation $(\hat{\phi}, \hat{h}) = (\phi, h)$.


We can use this setup to talk about white noise, which is (formally) a random centered Gaussian $w$ (which takes in a function $\phi$ and outputs a number) given by $\mathrm{Var}(\phi, w) = \|\phi\|_2^2$. To actually understand what's going on, we're saying that we divide our domain into very small pieces and put a small Gaussian in each one, independently and normalized so that we get a signed random measure. Then we do the usual inner product integration by integrating $\phi$ against this signed measure.

In order to check that this is fully well-defined, we need to write down the covariance matrix and evaluate $\mathrm{Cov}((\phi_1, w), (\phi_2, w))$. But we already know this, because

$$\|\phi_1 + \phi_2\|_2^2 = (\phi_1 + \phi_2, \phi_1 + \phi_2) = (\phi_1, \phi_1) + 2(\phi_1, \phi_2) + (\phi_2, \phi_2)$$

(in other words, knowing the inner products of elements with themselves in a Hilbert space tells us all of the inner products), and thus $\mathrm{Cov}((\phi_1, w), (\phi_2, w)) = (\phi_1, \phi_2)$ and in fact we have a Hilbert space norm here. And if we take the Fourier transform of a complex-valued white noise (which is a tempered distribution), we will also get another white noise, because that replaces the equation $\mathrm{Var}(\phi, w) = \|\phi\|_2^2$ with basically the same thing with hats on $\phi$ and $w$. (And in particular, notice that white noise is a random element in the set of tempered distributions.)
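The “small Gaussian in each small piece” picture can be simulated directly. The sketch below discretizes white noise on $[0, 1]$ into cells of width `dx`, each carrying an independent $N(0, dx)$ mass, and estimates $\mathrm{Cov}((\phi_1, w), (\phi_2, w))$ by Monte Carlo. The choice of interval, the test functions $\phi_1 = 1$ and $\phi_2(x) = x$ (which are not Schwartz functions), and all function names are assumptions made for illustration.

```python
import random


def sample_pairings(phi1, phi2, n_cells, rng):
    """One sample of ((phi1, w), (phi2, w)) for white noise discretized on [0, 1]."""
    dx = 1.0 / n_cells
    s1 = s2 = 0.0
    for i in range(n_cells):
        x = (i + 0.5) * dx
        # Each cell of width dx carries an independent N(0, dx) mass.
        mass = rng.gauss(0.0, dx ** 0.5)
        s1 += phi1(x) * mass
        s2 += phi2(x) * mass
    return s1, s2


def empirical_covariance(phi1, phi2, n_cells=20, n_samples=20000, seed=0):
    """Monte Carlo estimate of Cov((phi1, w), (phi2, w))."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        a, b = sample_pairings(phi1, phi2, n_cells, rng)
        total += a * b  # both pairings are centered, so no mean correction is needed
    return total / n_samples


cov = empirical_covariance(lambda x: 1.0, lambda x: x)
print(cov)  # should be close to the L^2 inner product, the integral of x over [0, 1]
```

The exact covariance of the discretized pairings is the Riemann sum $\sum_i \phi_1(x_i)\phi_2(x_i)\,dx$, which converges to $(\phi_1, \phi_2)$ as the cells shrink.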

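We can now look at the Laplacian of white noise, which we can define in a weak sense. That will still be a tempered distribution, because we can use integration by parts to get a definition. Specifically, if we want the derivative of some white noise $w$, we have (imagining our white noise is in two dimensions)

$$\left(\frac{\partial w}{\partial x}, \phi\right) = \int \frac{\partial w}{\partial x}(x, y)\,\phi(x, y)\,dx\,dy = -\int w(x, y)\,\frac{\partial \phi}{\partial x}(x, y)\,dx\,dy = -\left(w, \frac{\partial \phi}{\partial x}\right)$$

by integration by parts, where we use the fact that $\phi$ vanishes at infinity because it's a Schwartz function. So we can do the same argument with $\frac{\partial^2}{\partial x^2}$ and also $\frac{\partial^2}{\partial y^2}$ (where the two sign changes cancel), and we'll find that the Laplacian of white noise is

$$(\Delta w, \phi) = (w, \Delta \phi).$$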


But we also care about the inverse Laplacian in the discrete Gaussian free field. Since taking the Laplacian is like multiplying by $(x^2 + y^2)$ in Fourier space (up to sign and constant factors), taking the inverse Laplacian should be like multiplying by $\frac{1}{x^2 + y^2}$, which is itself a tempered distribution. In general, we can't necessarily multiply distributions together, so this doesn't give us a formal definition. But this is where the Gaussian free field is coming from: $(-\Delta)^{-1/2} w$ is what we want our continuum GFF to be.


Remark 45. The analogous thing to think about in one dimension is that the integral of white noise is Brownian 
motion, because white noise is essentially a bunch of random increments and Brownian motion is the sum of those 
random increments in the limit. And we can also scale systems like percolation in an appropriate way so that we end 


up with Gaussian random variables in the limit, and that’s also going to be white noise. 


7 September 30, 2021 


We'll discuss local sets and related topics some more today. First of all, we'll be using the notation

$$I_\Gamma(f) = (\Gamma, f) = \text{``}\int_D \Gamma(x) f(x)\,dx\text{''}$$

from Chapter 3 of the lecture notes for a continuous GFF $\Gamma$, where the quotation marks are meant to indicate that $\Gamma$ is a generalized function rather than an actual function. Recall that if we have the Dirichlet inner product $(f, g)_\nabla = \int_D \nabla f(x) \cdot \nabla g(x)\,dx$, meaning that $(f, f)_\nabla = \|\nabla f\|_2^2$ is the square of the $L^2$ norm of the gradient, then we can define the GFF via

$$\Gamma = \sum_i \alpha_i f_i,$$

where $\alpha_i$ are iid standard normal random variables and $f_i$ are elements of an orthonormal basis of $H(D)$ (the Hilbert space closure of the set of test functions on $D$) under the Dirichlet inner product. We've discussed frequently that $(f, g)_\nabla = (f, -\Delta g)$ by integration by parts, but we can also write this in another important way: if we let $\rho = -\Delta g$, then

$$(g, g)_\nabla = (\Delta^{-1}\rho, \Delta^{-1}\rho)_\nabla = (-\Delta^{-1}\rho, \rho) = -\int \rho(x)\,\Delta^{-1}\rho(x)\,dx.$$


The Green's function helped us deal with the inverse Laplacian in the discrete case, and similarly here we are looking for a function $\phi$ such that $-\Delta\phi = \delta_0$ (so that we also have a function which is harmonic everywhere except at a single point, and then taking an average or integral over different $\delta_y$ translations would get us what we want). And the idea is to think of this physically: a gravitational potential for a Newtonian point mass satisfies this if our domain $D$ is circular or spherical, so (up to normalizing constants) in dimension 2 we want the function $-\log|x - y|$, and in dimensions $d \ne 2$ we want $|x - y|^{2-d}$. (We can check that these functions, as a function of $x$, are both harmonic away from $y$.)

From here, the next idea is that if we have a harmonic function $\phi$ and a Brownian motion $B_t$, then $\phi(B_t)$ is a local martingale (and in fact a martingale if we stop it at a bounded value). This is then useful for making statements like “the expected value of $\phi$ when the Brownian motion hits a sphere or annulus containing our starting point is the same as the value of $\phi$ we started with.”


Fact 46
If our domain $D$ is not a nice shape like a circle, but we still want to find a function $\phi$ such that $-\Delta\phi = \delta_0$, then we can use

$$G_D(x, y) = -\log|x - y| - (\text{the harmonic extension of } -\log|x - y| \text{ from } \partial D \text{ to } D)$$

as our Green's function instead.


And this function $G_D(x, y)$ is, much like for our random walk, the expected amount of time that a Brownian motion started at $y$ spends at $x$ before hitting the boundary. (But we have to make sense of it with a limiting behavior too.) So the Green's function can now be used to invert the Laplacian that we were working with earlier, and we have

$$(-\Delta^{-1} g)(x) = \int_D G_D(x, y)\,g(y)\,dy.$$

Returning to our equation from before, where $\rho = -\Delta g$, we find that

$$(g, g)_\nabla = -\int \rho(x)\,\Delta^{-1}\rho(x)\,dx = \iint \rho(x)\rho(y)\,G_D(x, y)\,dx\,dy.$$

More generally (with $\rho_i = -\Delta g_i$), we have that

$$(g_1, g_2)_\nabla = \iint \rho_1(x)\rho_2(y)\,G_D(x, y)\,dx\,dy = \mathrm{Cov}(I_\Gamma(\rho_1), I_\Gamma(\rho_2)).$$


This should make sense, because if $\rho_1, \rho_2$ are just delta functions, we're saying that the covariance at two points is the Green's function. (But $I_\Gamma(\rho)$ is not actually defined if $\rho$ is a delta function, so we shouldn't take that too literally.) In particular, the Dirichlet energy corresponding to the function $G_D(x, y)$ in two dimensions is actually infinite, but it's finite on any annulus not containing the origin $x = y$ (because we have a harmonic function, but also because in two dimensions the Dirichlet energy is scale-invariant and in fact invariant under conformal transformations). The way we actually calculate the Dirichlet energy if we do contain the origin would be that

$$\int \left|\nabla \log|x - y|\right|^2 dx = \int \frac{1}{r^2}\,r\,dr\,d\theta = \int \frac{2\pi}{r}\,dr$$

by switching to polar coordinates, and that tells us that if we replace $-\log|x - y|$ by a constant “plateau” inside a small ball of radius $r$, we have a finite Dirichlet energy (because the $\frac{1}{r}$ no longer blows up), and the Laplacian of that new function is like a delta function but on the circle of radius $r$. Formally, if $\Gamma$ is a GFF, we define $\Gamma_r(x)$ to be the mean of $\Gamma$ on the boundary of a ball $B_r(x)$. Now even though $\Gamma$ is not defined at individual points, $\Gamma_r$ will be almost surely defined (because the Dirichlet energy is now finite). We'll expand more on that “almost surely” point below:


Definition 47 


A Gaussian Hilbert space is a space where all elements are centered Gaussian random variables, and the inner 


product is the covariance of those variables. 
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For any random variable f in a general Hilbert space, we can then index it into a Gaussian Hilbert space with the 


random variable R(f) = (T, f)v, and we then want 


Cov(R(f), R(g)) = (Ff, g)v- 


We can then define a distance function d(f, g) = Var(f — g), and this will in fact be the same as the original distance 
on our original Hilbert space. That's what's actually going on when we define the Gaussian free field as T = > ajff, 


because if we have g = )— Gif; for some fixed constants G;, then 


Var(l, g)v = Var (~ 016) = > 6? =(9,9)v. 


So if we take any g in the Hilbert space with Be finite, we get a (T, g)v with finite variance. But remember that 
when we define [ = >> ajfi, we have >> a? almost surely infinite, and to be in the Hilbert space we need the sum to be 
finite! So this standard Gaussian element is not actually in the Hilbert space, but given some values a; we always know 
what's going on with (fF, g)yv with probability 1. On the other hand, we cannot actually have a well-defined (T, g)v 
for all g simultaneously — for example, picking 6; = san(ai) will give us something infinite. So F is not a proper Hilbert 
space element, but it's “honorary” in the sense that we can almost surely take the inner product of [ with any fixed g. 

However, $\Gamma = \sum_i \alpha_i f_i$ does converge in the sense of distributions. Checking that convergence requires us to
integrate it against smooth functions, and smooth functions have $\beta_i$'s decaying fast enough (for example, consider the
Fourier modes on a torus). In particular, that means that $(\Gamma, g)_\nabla$ is almost surely well-defined for all smooth $g$.


Remark 48. In electrostatics, the expression

$$\int_D \int_D \rho(x)\, G(x, y)\, \rho(y)\, dx\, dy$$

is known as the potential energy of assembly: the Green’s function tells us how much energy it takes to pull two
charges apart under the electrostatic potential, and this energy tells us the energy to go from one configuration to
another.


We'll now recall the way that we proved that Brownian motion is continuous: we can write down the value of $B_t$
at any finite collection of points and it’s perfectly well-defined because we know its covariance. But now we want
to see if this is actually a continuous random function. We do this by extending to a countable set (namely, the
dyadic rationals), and then we want to show that the limit to any real number exists almost surely, which we do using
Kolmogorov's continuity theorem (which basically requires a condition of the form $E[d(X_s, X_t)^\alpha] \le K|t - s|^{1+\beta}$ for
all $s, t$). This strategy in fact works for the function $\Gamma_\varepsilon(x)$: we can use the same argument to show that this is
continuous in both $\varepsilon$ and $x$ (bounded away from $\varepsilon = 0$).

Furthermore, if we consider a process given by $B_t(x) = \Gamma_{e^{-t}}(x)$ (so the circle around $x$ gets smaller and smaller),
the variance of $B_t$ will be $t$, because we've mentioned that it has to do with the log of the radius of our circle in
two dimensions. And then if we check some covariance properties, it also makes sense that we have $\mathrm{Cov}(B_s, B_t) =
\min(s, t)$. So this is actually just Brownian motion, and we'll be able to explore this some more later. For example,
we can find certain specific $x \in D$ such that $B_t(x)$ will be positive for all time, even though Brownian motion does
not stay positive almost surely, and those $x$ will exhibit some interesting behavior.


Fact 49


The construction $\Gamma = \sum_i \alpha_i f_i$ works well in one dimension, because then the sum converges in $L^2$, since the $f_i$ are
orthonormal under the Dirichlet inner product rather than the usual $L^2$ inner product. But now we can look at the spaces
$\Delta^s L^2$ (where we apply some power of the Laplacian), and the idea is that if we apply a negative enough $s$, our
functions will look nicer. And if the $f_i$'s form an orthonormal basis in some of these $\Delta^s L^2$ spaces, we get the
fractional Gaussian fields, which is where fractional Brownian motion comes from.


8 October 5, 2021 


We'll return to the lectures on “local sets of the Gaussian free field” in further detail now. We've already defined
the standard Gaussian and the discrete Gaussian free field (the latter of which uses the Dirichlet form $(f, g)_\nabla =
\sum_{x \sim y} (f(x) - f(y))(g(x) - g(y))$ to define the density using $H(f) = (f, f)_\nabla$). We've also previously discussed the
effect of boundary conditions on this density, and we've mentioned the Markov property for successively determining
values of the GFF for a discrete graph.

From there, we did the continuum free field case by defining $h = \sum_i \alpha_i f_i$ again, but this time using the Dirichlet
inner product

$$(f_1, f_2)_\nabla = \int_D (\nabla f_1 \cdot \nabla f_2)\, dx\, dy,$$

where $f_1, f_2$ are elements of $H(D)$, the Hilbert space closure of the set of smooth, compactly-supported functions.
So for example, the function $|x| - 1$ on the unit disk does count even though it’s not smooth, because we can round out
the problem parts of the function and take a limit, but something like $-\log|x|$ has well-defined but infinite Dirichlet
norm, so anything in the set of smooth compactly-supported functions is far away from it and we can't mollify the
function. (Being more specific would get us into technical details like defining functions as equivalence classes up to a
set of measure zero.) And for another example, the function which is 1 on a disk centered at the origin and 0 outside
is also not in $H(D)$, because the function needs to jump from 1 to 0 across the boundary of the disk, and any smooth
approximation does so with a gradient whose Dirichlet energy goes to $\infty$.


Fact 50 


It turns out that if we project the continuous GFF onto the set of functions that are piecewise linear on lattice 


triangles (that is, tiling the lattice grid further by drawing diagonals), we will get the discrete Gaussian free field 


times a lattice-dependent constant. 


The idea here is that we are just taking a finite-dimensional subspace of all functions that exist in the GFF, but
we're still dealing with the same linear combination business $\sum_i \alpha_i f_i$. For example, consider the function $f$ which is a
linear interpolation on a right triangle with value $\alpha$ at the right angle and $\beta, \gamma$ at the other two vertices. If we try to
integrate the squared gradient of $f$, we're integrating $(\beta - \alpha)^2 + (\gamma - \alpha)^2$ across the whole triangle, which yields
$\frac{1}{2}(\beta - \alpha)^2 + \frac{1}{2}(\gamma - \alpha)^2$. But then summing this up across all of the different triangles (every term shows up twice)
gives us exactly the Dirichlet energy, so the $L^2$ norm of the gradient of a piecewise linear function is a discrete
$L^2$ norm!
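
The triangle computation can be checked numerically. The following is a minimal sketch, assuming a unit-spacing square grid whose squares are each split into two right triangles, and assuming zero boundary values (so that every edge with a nonzero difference is shared by exactly two triangles). The function names and the random test field are illustrative, not from the lecture.

```python
import random


def discrete_dirichlet_energy(f):
    """Sum of (f(x) - f(y))^2 over nearest-neighbor edges of the square grid."""
    n = len(f)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                total += (f[i + 1][j] - f[i][j]) ** 2
            if j + 1 < n:
                total += (f[i][j + 1] - f[i][j]) ** 2
    return total


def piecewise_linear_energy(f):
    """Integral of |grad f|^2 for the piecewise linear interpolation of f.

    Each unit square is cut into two right triangles of area 1/2; on each
    triangle the gradient is constant and given by the differences along the legs.
    """
    n = len(f)
    total = 0.0
    for i in range(n - 1):
        for j in range(n - 1):
            a, b, c, d = f[i][j], f[i + 1][j], f[i][j + 1], f[i + 1][j + 1]
            total += 0.5 * ((b - a) ** 2 + (c - a) ** 2)  # right angle at (i, j)
            total += 0.5 * ((d - c) ** 2 + (d - b) ** 2)  # right angle at (i+1, j+1)
    return total


def random_field_with_zero_boundary(n, seed=0):
    """Random values at interior vertices, zero on the boundary of the n x n grid."""
    rng = random.Random(seed)
    f = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            f[i][j] = rng.gauss(0.0, 1.0)
    return f


f = random_field_with_zero_boundary(8)
continuum = piecewise_linear_energy(f)
discrete = discrete_dirichlet_energy(f)
```

The two energies agree up to floating-point error, and the diagonal edges of the triangulation never contribute.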

In summary, there are two ways to “tame” the Gaussian free field: we can consider $h_\varepsilon(z)$, which averages the
value of $h$ on the circle of radius $\varepsilon$ around $z$, or we can just look at a triangulation and get a good approximation too.
(And remember that we can write down a measure $e^{-\frac{1}{2} \iint \rho(x) G(x, y) \rho(y)\, dx\, dy}\, d\rho$, where the exponent is the energy of assembly
of a charge density. But making sense of what $d\rho$ means can be difficult in general.)


With that discussion, we're now ready to vaguely characterize the Schramm-Loewner evolution (SLE):


Definition 51


Let $D$ be a domain and $a, b$ be two points on the boundary. The SLE$_\kappa$ curve is a (non-intersecting) random curve
$\eta$ that starts at $a$ and ends at $b$, and whose law is conformally invariant (up to time-change). We also require that the
curve satisfies the Markov property, meaning that if the curve is given up to some stopping time $\tau$, then the
conditional law of the rest of $\eta$ is the SLE$_\kappa$ curve on the complement of the existing curve.


The Gaussian free field is conformally invariant, so it makes sense that the SLE curve that comes out of the
hexagonal lattice is also conformally invariant. In particular, if we define a GFF with negative boundary conditions
on one side and positive boundary conditions on the other, and we only observe its values on a triangular lattice and
linearly interpolate in between, then the zero level sets (paths on which the free field is negative on one side and
positive on the other) trace out a boundary. And then the expectations on the two sides of that level set, given the
interface curve, will be slightly different.


Remark 52. What's going on here is actually special to dimension 2: notice that if we have a 1-dimensional Brownian 
motion and want to scale it horizontally by $n$, we need to divide by $\sqrt{n}$. But no such rescaling has to happen in 2
dimensions, because the original Gaussian free field is being projected onto finer lattices, and we get a Dirichlet energy 


that is constant if we rescale (since area scales in the same factor as the squared gradient). 


Essentially, what we're saying is that as we have a finer and finer mesh, the expected value of the GFF approaches
a negative constant on one side and a positive constant on the other. That leads us to a property of SLE$_4$ (which we
still haven't properly defined):


Proposition 53 


Let $D$ be a domain, and let $a, b$ be two points on the boundary. Suppose we draw SLE$_4$ starting at $a$ up to some
stopping time $T$ (such as hitting a ball centered at $a$). Then the probability that the curve keeps $z$ on its left
on its way from $a$ to $b$ is the same as the probability that a Brownian motion started at $z$ hits the left part of
$\partial D \cup \eta([0, T])$ before the right part.


This should make sense with our hexagonal lattice local set: if we imagine that the left side starts with boundary
condition 0 and the right side starts with boundary condition 1, then the harmonic extension $A(z)$ given the curve
$\eta([0, T])$ also gives the probability that a Brownian motion started at $z$ hits the right side (because harmonic functions
lead to Brownian motion martingales). And then being on the left or the right of the curve relates to the fine mesh
positive and negative constants on the two sides, because when we finish drawing the curve from $a$ to $b$, we'll have
approximately 0 on the left and 1 on the right. (But for more details, we should read about the “height gap” lemma.)

We can now return to the random walk on the hexagonal lattice from the first day’s lecture — instead of having 
to sample from the continuous GFF for each hexagon to decide whether we turn left or right, we can start a random 
walk on that hexagon and see whether we hit a “left” or “right” hexagon first, and then we have a martingale in the 


“harmonic explorer” (which we can search up on Google). 


Remark 54. When we think about level sets with modified boundary conditions $+\phi$ as on our problem set, we should
keep in mind that $h$ and $h + \phi$ will be absolutely continuous with respect to each other. So if our zero set almost surely converges
to a random curve, then the same statement will be true with $h + \phi$, and that allows us to argue that an SLE curve
is unlikely to reach a large peak within SLE. So we should worry less about the well-definedness details; those come up
more in the papers that were written by Professor Sheffield in other cases.


9 October 7, 2021


We'll talk today a bit more about SLE curves. We mentioned last time that they must satisfy certain consistency
properties (conformal invariance and the Markov property), but we'll define them more carefully now. In particular,
it’s enough to just define SLE on a single simply-connected domain $D$ (because the Riemann mapping theorem lets
us map $D$ to any other such domain), and it turns out the nicest one to do so on is the upper half plane

$$\mathbb{H} = \{z \in \mathbb{C} : \mathrm{Im}(z) > 0\}.$$


It is then convenient to choose our starting and stopping points $a, b \in \partial\mathbb{H}$ to be $0$ and $\infty$, so our SLE curves will
start at the origin and move upward in the half-plane. But we have to figure out how to parameterize this curve: with
something like Brownian motion, we could use the quadratic variation, which we'll quickly review first. Intuitively, the
quadratic variation is kept track of by counting the number of times a Brownian motion $B_t$ jumps from one multiple
of $\varepsilon$ to another in a fixed time $t \in [0, 1]$,⁵ and this scales as $\frac{1}{\varepsilon^2}$. So as $\varepsilon \to 0$, $\varepsilon^2$ times the number of $\varepsilon$-crossings in
1 unit of time will converge to 1, and that gives us a way to keep track of the time of the Brownian motion, as
long as we can keep track of distance. Similarly, if we have a two-dimensional Brownian motion, we can just project
down onto a single dimension and work out the time in the same way. (And if $s$ is an increasing function of $t$, the
process $B_{s(t)}$ is just a time-change of a Brownian motion; this allows us to parameterize many different processes.)

But what we have here is an SLE curve, and there is actually a more convenient way to parameterize that.
Specifically, let $s$ be some large real number, and consider a two-dimensional Brownian motion $B_t^{is}$ that starts at $is$
and stops at time $\tau$, either when it hits our curve or the boundary of the half-plane.


Definition 55
Suppose our SLE curve is only run for some finite time and has image $K$ in the upper half-plane. Then $K$ is a hull,
meaning that its complement is a simply-connected domain, and for any such hull we can define the (half-plane)
capacity

$$\mathrm{hcap}(K) = \lim_{s \to \infty} s\, E[\mathrm{Im}\, B^{is}_\tau].$$


To understand this, recall that if we start a Brownian motion at the point $i$ (at height 1) until we hit the real
axis, its exit position is distributed as a Cauchy random variable with density $\frac{1}{\pi(1 + x^2)}$. Then the probability of hitting
between $0$ and $\varepsilon$ is basically proportional to $\varepsilon$, but this whole process is scale-invariant! So starting at height $\frac{1}{\varepsilon}$ gives
us a probability of order $\varepsilon$ of hitting a constant-width interval on the real axis. If we then accept that hitting a constant-width
interval and hitting a constant-height SLE curve are on the same order of probability, it makes sense that as $s$ goes to
infinity, we'll have an expectation on the order of $\frac{1}{s}$ for $\mathrm{Im}\, B^{is}_\tau$.
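
As a small sketch of the scaling heuristic, the exit distribution from height $s$ is a Cauchy law with scale $s$, so the probability of exiting through an interval has a closed form in terms of the arctangent. The function name `hit_probability` and the specific test values are illustrative choices.

```python
import math


def hit_probability(height, a, b):
    """P(Brownian motion started at i*height exits the upper half-plane through [a, b]).

    The exit point is Cauchy distributed with scale `height`, whose CDF is
    1/2 + arctan(x / height) / pi.
    """
    return (math.atan(b / height) - math.atan(a / height)) / math.pi


# Scale invariance: hitting [0, eps] from height 1 equals hitting [0, 1] from height 1/eps.
eps = 0.01
from_height_one = hit_probability(1.0, 0.0, eps)
from_height_inverse_eps = hit_probability(1.0 / eps, 0.0, 1.0)

# For large s, s * P(hit [0, 1] from height s) approaches the constant 1/pi.
s = 1000.0
rescaled = s * hit_probability(s, 0.0, 1.0)
```

So the hitting probability of a fixed-width interval decays like a constant times $\frac{1}{s}$, which is why the definition of hcap multiplies by $s$ before taking the limit.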

We'll then use hcap as our notion of time: we'll say that the curve is parameterized at time $t$ if it has capacity $t$.
But then we need to check things like “if we add more to the curve, then the capacity is increasing” to check that we
have a reasonable notion of time. For that, if $\eta$ is a curve in the upper-half plane parameterized from $0$ to $t$, then
we can let $g_t : \mathbb{H} \setminus \eta([0, t]) \to \mathbb{H}$ be a conformal map. But this conformal map is not unique: if we further require
that $\infty$ maps to $\infty$, then we still have two degrees of freedom (translation and scaling). So we'll put the following
constraints on our map $g_t$: we additionally ask for it to “look like the identity near infinity” via

$$\lim_{z \to \infty} (g_t(z) - z) = 0,$$

which requires $\infty$ to map to $\infty$, and it also removes the scale and translation factors (because we can’t add or multiply
things to $g_t$). Next, another function $f_t$ can be defined similarly, but now requiring $f_t(z) \sim z$ as $z \to \infty$ and that
the tip of the curve goes to the origin (this basically flattens $\eta([0, t])$ onto some connected segment of the real line,
and it tells us about the probability of hitting the curve more explicitly). These two functions then just differ by some
function $W_t$ (which essentially tells us how far to the left or right the curve moves, in a conformal sense), and both
$f_t$ and $g_t$ have Laurent expansions near $\infty$ because of our given constraints. The next lower-order term, $\frac{1}{z}$, in the
Laurent expansions turns out to be related to the capacity.


⁵If our process were a smooth curve, we'd have something that scales as $\frac{1}{\varepsilon}$ instead.


Remark 56. Notice that if our SLE curve is very flat and moves horizontally instead of vertically, then the capacity 
is relatively small but the curve has high diameter. So this quantity hcap cares most about vertical movement of the 


curve. 


Notice also that “capacities add,” because when we have two functions $\varphi(z) = z + \frac{a}{z} + o(|z|^{-1})$ and $\psi(z) =
z + \frac{b}{z} + o(|z|^{-1})$, then $\psi \circ \varphi(z) = z + \frac{a + b}{z} + o(|z|^{-1})$ just by expanding out some of the terms. So if we have two sets
$K, K'$ sitting on top of each other, and $J = K \cup K'$ is our combined hull, then we can look at a conformal map $g_K$
under which $K$ gets sent to a part of the real line and $K'$ gets sent to some new hull $g_K(K')$. We then might be interested
in how $\mathrm{hcap}(J)$ looks relative to $\mathrm{hcap}(K)$ and $\mathrm{hcap}(K')$; it turns out that

$$\mathrm{hcap}(J) = \mathrm{hcap}(K) + \mathrm{hcap}(g_K(K')),$$

because the capacity is the $\frac{1}{z}$ coefficient of the Laurent series, and composing the two conformal maps (flattening $K$,
then flattening $g_K(K')$) adds those coefficients.
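
As a hedged sketch of the “expanding out” step, write the two maps as $\varphi(z) = z + \frac{a}{z} + o(|z|^{-1})$ and $\psi(z) = z + \frac{b}{z} + o(|z|^{-1})$, where the coefficient names $a$ and $b$ are labels chosen here for illustration. Since $\varphi(z) \sim z$ near infinity, substituting one expansion into the other gives:

```latex
\begin{align*}
\psi(\varphi(z))
  &= \varphi(z) + \frac{b}{\varphi(z)} + o\bigl(|\varphi(z)|^{-1}\bigr) \\
  &= \left(z + \frac{a}{z}\right) + \frac{b}{z}\bigl(1 + o(1)\bigr) + o\bigl(|z|^{-1}\bigr) \\
  &= z + \frac{a + b}{z} + o\bigl(|z|^{-1}\bigr).
\end{align*}
```

So the $\frac{1}{z}$ coefficients simply add under composition, which is the algebraic content of the additivity of hcap.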


The next result takes our discussion about the function $W_t$ and tells us about the conformal map $g_t$:


Theorem 57 (Loewner’s theorem)


We have the ordinary differential equation

$$\partial_t g_t(z) = \frac{2}{g_t(z) - W_t}.$$


At time $t = 0$, we know that $g_t$ is the identity map, so $g_0(z) = z$, and we know that $W_0 = 0$. So the derivative
looks like $\frac{2}{z}$ at time 0, and that gives us a vector field which basically fans downward and outward to the negative- and
positive-real axis, except on the imaginary axis (where it points straight down). (So if $W_t = 0$ for all time, then we flow
according to that vector field, and the corresponding curve moves directly upward.)
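
To make the zero-driving case concrete, here is a minimal sketch that integrates $\partial_t g_t(z) = \frac{2}{g_t(z) - W_t}$ with $W_t = 0$ using a classical Runge-Kutta step and compares against the closed form $g_t(z) = \sqrt{z^2 + 4t}$, which solves the same equation with $g_0(z) = z$. The function names, step count, and test point are illustrative assumptions.

```python
import cmath


def loewner_flow(z, t_final, driving=lambda t: 0.0, steps=2000):
    """Integrate dg/dt = 2 / (g - W_t) from g_0 = z with classical RK4."""
    g = complex(z)
    dt = t_final / steps

    def rhs(t, value):
        return 2.0 / (value - driving(t))

    t = 0.0
    for _ in range(steps):
        k1 = rhs(t, g)
        k2 = rhs(t + dt / 2, g + dt * k1 / 2)
        k3 = rhs(t + dt / 2, g + dt * k2 / 2)
        k4 = rhs(t + dt, g + dt * k3)
        g += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += dt
    return g


def slit_map(z, t):
    """Closed form sqrt(z^2 + 4t), choosing the branch in the closed upper half-plane."""
    w = cmath.sqrt(z * z + 4 * t)
    return w if w.imag >= 0 else -w


z0 = 1 + 2j
numeric = loewner_flow(z0, 1.0)
closed_form = slit_map(z0, 1.0)
```

The closed form shows that the point $2i\sqrt{t}$ is sent to 0 at time $t$, consistent with the curve growing straight up the imaginary axis.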


Remark 58. Notice that if we multiply a hull $K$ by some scale factor $c$, the capacity is scaled by $c^2$ (for example, with
$c = 2$ we double the probability of hitting the set instead of the real axis, and we also double the expected height that
we hit at). This means it takes longer and longer time to reach a higher height for our SLE curve.


So one lesson to learn is that if we now look at increments $[t, t + 1]$ of our SLE curve, they aren’t completely
independent (because SLE can’t intersect itself), but if we conformally map the complement of $\eta([0, 1])$ to all of $\mathbb{H}$,
then the Markov property tells us that the image of the curve increment $\eta([1, 2])$ is independent of $\eta([0, 1])$, and so on
(meaning that the $W_t$ increments are also independent). And this happens in such a way that after time $T^2$, the height
of the SLE curve is about $T$ in the original domain.




Fact 59 


Intuitively, adding a small tip to the top of our SLE curve will affect the capacity more than the same piece added
to the bottom of our curve. This also manifests in the fact that the map $f(z) = \sqrt{z^2 + 1}$ is a conformal map
from $\mathbb{H} \setminus [0, i] \to \mathbb{H}$.


Continuing on this logic, we can also derive that scaling this system up spatially by a factor of $a$ makes time run a
quadratic factor faster, and therefore we have $W_t$ equal in law to $\frac{1}{a} W_{a^2 t}$. But this scaling symmetry and
independent-increments property is only satisfied by a constant times Brownian motion! So this gets us to our definition: if
$W_t = B_{\kappa t}$ (which is equivalent in law to $\sqrt{\kappa} B_t$), then a larger $\kappa$ basically gives us more wiggling in our driving function
$W_t$, which makes $g_t$ wiggle more.


Fact 60 
It turns out that if $\kappa$ is between 0 and 4, then the SLE$_\kappa$ curve is a simple curve almost surely (it doesn’t intersect
itself), and the dimension of the curve is $1 + \frac{\kappa}{8}$ with probability 1. Then when $4 < \kappa < 8$, we get a curve that hits itself but
doesn’t cross itself (in this case we swallow any pieces of the half-plane that are closed off and include them as part
of our hull), and when $\kappa \ge 8$, we get a space-filling curve.


10 October 12, 2021 


We'll start today with a short overview of some stochastic calculus concepts before applying them to the SLE 
differential equation. The idea with stochastic calculus is to generalize our ordinary notion of calculus to random 
processes, particularly those related to Brownian motion. 

As a starting point, let $B_t$ be a Brownian motion. Then the “derivative” of Brownian motion, $dB_t$, is white noise,
but it’s not the same notion as something like $X(t) = t^2 \implies dX(t) = 2t\, dt$. The idea is that we can integrate
against our “$dt$,” so we should similarly be able to integrate expressions that look like

$$\int f(t)\, dB_t.$$

We can start simple: if $f(t) = c$, then we should have $\int_0^t c\, dB_s = cB_t$ for any real number $c$. And similarly, if we put
a piecewise constant function $f(t)$, we can break up the integral and evaluate it on each interval to get a sum that
looks like $\sum_i c_i (B_{t_{i+1}} - B_{t_i})$. (One way to think about this is to think of $B_t$ as a stock price, and then $f(t)$ is the
number of shares that we have invested at any given time.)

From there, we can approximate continuous functions with piecewise constants: as long as our approximations are
converging in $L^2$, the variance of the difference between our approximation and the limit converges to 0. So this allows
us to define these types of integrals against Brownian motion, and in fact we can even make sense of this when our
integrand $f(t)$ is a function of the Brownian motion up until time $t$.
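
The piecewise-constant construction can be sketched directly as a left-endpoint sum over a simulated path. This is an illustrative discretization (the step count, seed, and function names are assumptions), in which the integrand at step $i$ only uses information available at time $t_i$, matching the “number of shares held” picture.

```python
import random


def brownian_increments(n_steps, t_final, rng):
    """Independent N(0, dt) increments of a Brownian motion on [0, t_final]."""
    dt = t_final / n_steps
    return [rng.gauss(0.0, dt ** 0.5) for _ in range(n_steps)]


def stochastic_integral(integrand, increments):
    """Left-endpoint sum  sum_i f_i * (B_{t_{i+1}} - B_{t_i})."""
    return sum(f * db for f, db in zip(integrand, increments))


rng = random.Random(0)
increments = brownian_increments(10000, 1.0, rng)

# Build the path B_{t_0}, ..., B_{t_n} from its increments.
path = [0.0]
for db in increments:
    path.append(path[-1] + db)

# Constant integrand c = 3: the sum telescopes to 3 * B_1.
constant_case = stochastic_integral([3.0] * len(increments), increments)

# Integrand B_{t_i}, which is known at time t_i (left endpoint of each interval).
adapted_case = stochastic_integral(path[:-1], increments)
sum_of_squares = sum(db * db for db in increments)
```

Because $B_{t_{i+1}}^2 - B_{t_i}^2 = 2B_{t_i}(B_{t_{i+1}} - B_{t_i}) + (B_{t_{i+1}} - B_{t_i})^2$, the sums satisfy the exact algebraic identity `2 * adapted_case == path[-1]**2 - sum_of_squares` (up to rounding), and `sum_of_squares` is close to the elapsed time 1.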

Now that we have integration, we can think about how to define the derivative of something like $B_t^2$: can we find
a process $X_t$ so that $\int X_t\, dt = B_t^2$, or processes $X_t, Y_t$ so that $\int X_t\, dt + \int Y_t\, dB_t = B_t^2$? It might feel like we can write
down something like $d(B_t^2) = 2B_t\, dB_t$, but there's a bit of subtlety here. We defined integration in a way that ensures
that the “amount of money we have” in our $dB_t$ integral is always a martingale (because we really defined it explicitly
only with piecewise constant functions, and a constant times Brownian motion is always a martingale), but $B_t^2$ is not a
martingale (the expectation of $B_t^2$ is $t$ and the issue is essentially convexity of the function $x^2$). So we have a drift
term of 1 here, and that’s essentially what motivates the equation

$$d(B_t^2) = 2B_t\, dB_t + dt$$

(because now we have the linear upward drift that we need). It turns out this is actually correct: the $dt$ term always
gives us the expected drift. So now if we look at $B_t^2 - t$, we know that this expression has expectation 0 at all times
$t$, but that’s not enough for it to be a martingale: we need that the conditional expectation of $B_t^2 - t$, given everything
we know up to time $s$ (for any $s < t$), is $B_s^2 - s$. But that can be shown as well just by considering $B_t - B_s$ as an
increment independent of everything up to time $s$, and eventually that tells us that $B_t^2 - t$ is a martingale.
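
The conditional-expectation check can be written out in a few lines. In this sketch, $\mathcal{F}_s$ is notation introduced here (not in the lecture) for the information available up to time $s$, and we use that $B_t - B_s$ is independent of $\mathcal{F}_s$ with mean 0 and variance $t - s$:

```latex
\begin{align*}
E\bigl[B_t^2 - t \mid \mathcal{F}_s\bigr]
  &= E\bigl[(B_s + (B_t - B_s))^2 \mid \mathcal{F}_s\bigr] - t \\
  &= B_s^2 + 2 B_s\, E[B_t - B_s] + E\bigl[(B_t - B_s)^2\bigr] - t \\
  &= B_s^2 + 0 + (t - s) - t \\
  &= B_s^2 - s.
\end{align*}
```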

Extending this to a more general case gives us the following result (with the same idea of “correction for drift” and 


checking that the higher-order terms do not affect our integration): 


Theorem 61 (Ito's lemma) 


For a function $f$ and a Brownian motion $B_t$, we have

$$df(B_t) = f'(B_t)\, dB_t + \frac{1}{2} f''(B_t)\, dt.$$


For example, taking $f(x) = x^2$, this tells us that $d(B_t^2) = 2B_t\, dB_t + dt$, in agreement with the equation above.


But now let’s connect this back to the work we've been doing in the class: 


Example 62 

Last time, we discussed the SLE process, where we produce a random curve by producing a random function $g_t$
which maps $\mathbb{H} \setminus \eta([0, t]) \to \mathbb{H}$ and acts like the identity near $\infty$. (Recall that the tip of the curve
then maps to some point $W_t$ on the real line, and because $W_t = g_t(\eta(t))$, this helps us reconstruct $g_t$ and then
$\eta(t)$ just from $W_t$.) Specifically, the equation that governs our random function $g_t$ is

$$dg_t(z) = \frac{2}{g_t(z) - W_t}\, dt, \qquad dW_t = \sqrt{\kappa}\, dB_t.$$


This whole setup should seem bizarre: when we try to construct a random curve, we usually want an equation
in terms of $dW_t$ (where the tip of the curve is at time $t$). But in this case, our calculus is being done on the
normalizing map $g_t$, and once we know $W_t$, the differential equation has no stochastic calculus built into it.

Again reviewing the material from last time, we also have a variant $f_t$ of $g_t$ where instead of having it look like the
identity near $\infty$, we make the tip go to 0 and have the function look like a translation of the identity near $\infty$. We
then write $f_t(z) = g_t(z) - W_t$, and therefore

$$df_t(z) = dg_t(z) - dW_t = \frac{2}{f_t(z)}\, dt - \sqrt{\kappa}\, dB_t.$$


If we now plug in a real value for $z$ and view this as a real-valued process $X_t = f_t(z)$, we have the stochastic differential
equation

$$dX_t = \frac{2}{X_t}\, dt - \sqrt{\kappa}\, dB_t.$$

This means the size of the fluctuations in $X_t$ is constant, and the drift term $\frac{2}{X_t}\, dt$ pushes us away from the origin. What we have
here is a Bessel process, and it’s connected to the chi-square random variable (which is the sum of the squares of $n$
independent standard normal random variables): if we take $n$ independent copies of a one-dimensional Brownian motion and add
up their squares, that’s like looking at the squared distance of an $n$-dimensional Brownian motion from the origin, and we can
do Ito calculus on something like that. If we look at the distance of the Brownian motion from the origin for $n = 2$, we have

$$d\left(\sqrt{B_{1,t}^2 + B_{2,t}^2}\right) = \frac{B_{1,t}\, dB_{1,t} + B_{2,t}\, dB_{2,t}}{\sqrt{B_{1,t}^2 + B_{2,t}^2}} + \frac{1}{2\sqrt{B_{1,t}^2 + B_{2,t}^2}}\, dt$$


(here, the second derivative $dt$ term is just a Laplacian of the distance from the origin). In other words, there's an
expected drift outward because we have one term pushing us radially and also lots of orthogonal movement (which all
increases our distance from the origin). The distance of an $n$-dimensional Brownian motion is then represented by a
Bessel process, and studying that tells us something about the chance of an $n$-dimensional Brownian motion returning
to the origin (it only happens when $n = 1$).

Connecting this back to the SLE curve, the $g_t$ function causes points on the real line to evolve as a Bessel process,
and we can ask ourselves when $g_t(z)$ (for some real number $z$) coincides with the image of the tip of the curve
under $g_t$. This basically happens only in a limiting case where the SLE path hits the boundary, and in fact based on
facts about Bessel processes, that will only occur when $\kappa > 4$. So this approach is one way we can work out the phase
transition of SLE at $\kappa = 4$.
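
As a hedged sketch of where the threshold comes from, rescale the real-valued process $X_t = f_t(z)$, which satisfies $dX_t = \frac{2}{X_t}\, dt - \sqrt{\kappa}\, dB_t$, and compare it with the standard form $dY_t = dZ_t + \frac{n - 1}{2 Y_t}\, dt$ of a Bessel process of dimension $n$. The rescaled process $Y_t$ is notation introduced here for illustration.

```latex
\begin{align*}
Y_t = \frac{X_t}{\sqrt{\kappa}}
  \quad &\Longrightarrow \quad
  dY_t = \frac{2}{\kappa\, Y_t}\, dt - dB_t, \\
\frac{n - 1}{2} = \frac{2}{\kappa}
  \quad &\Longrightarrow \quad
  n = 1 + \frac{4}{\kappa}.
\end{align*}
```

Here $-B_t$ is again a Brownian motion, so $Y_t$ is a Bessel process of dimension $n = 1 + \frac{4}{\kappa}$. A Bessel process hits the origin exactly when its dimension is less than 2, and $1 + \frac{4}{\kappa} < 2$ is the same as $\kappa > 4$.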


Fact 63 


The other phase transition (where we start having space-filling curves at $\kappa = 8$) can also be understood via Bessel
processes, but the question then becomes one of “what order points on the real axis are swallowed” and thus the
analysis is a bit more subtle.


We'll now return to more Ito calculations, specifically those that correspond to the Gaussian free field. Going back
to our function $f_t(z)$, we can find (for a fixed $z$, and now using the more general formula $df(X_t) = f'(X_t)\, dX_t +
\frac{1}{2} f''(X_t)\, d\langle X, X \rangle_t$ with the quadratic variation term coming from $X$)

$$d \log f_t(z) = \frac{1}{f_t(z)}\, df_t(z) - \frac{1}{2 f_t(z)^2}\, d\langle f_t, f_t \rangle.$$

Since $f_t$ has a $\sqrt{\kappa}$ term in its differential equation, the quadratic variation has a $\kappa$ factor, and we're left with

$$d \log f_t(z) = \frac{1}{f_t(z)} \left( \frac{2}{f_t(z)}\, dt - \sqrt{\kappa}\, dB_t \right) - \frac{\kappa}{2 f_t(z)^2}\, dt = \frac{4 - \kappa}{2 f_t(z)^2}\, dt - \frac{\sqrt{\kappa}}{f_t(z)}\, dB_t.$$

And magically, if we're in the special case $\kappa = 4$, the drift term goes away, and thus $\log f_t(z)$ must actually be evolving
as a (local) martingale, meaning the (complex) argument must evolve as a martingale as well.

But now we can connect back to previous work: if we have our SLE curve evolving upward in the half-plane as
before, now let’s say that we have boundary conditions of $\pi$ on the left side of the curve and 0 on the right side. To
understand the harmonic extension of those boundary conditions, we use our conformal map to map the tip of the
curve down to 0, and we find that everything left of the origin on the real line has boundary condition $\pi$, while everything
right has boundary condition 0. The harmonic extension of that boundary data is exactly the argument function, so
$\arg f_t(z)$ must be a martingale.


Fact 64 


A more intuitive way to understand this is as follows: let $z$ be some point in the upper half-plane. Then once the
SLE curve finishes being drawn, if $z$ is to the left of the curve, then $f_t(z)$ will be mapped to the negative real line,
and if $z$ is to the right of the curve, it will be mapped to the positive real line. So the probability that $z$ ends up
on the left side of an SLE$_4$ curve is $\frac{\arg(z)}{\pi}$.


We can next consider the function $f_t'$, which is the derivative of $f_t$ with respect to $z$, and apply Ito’s lemma to it as
well. Note that we now have to be careful with the order in which we take our derivatives because there are Brownian
motions in $t$, but we want to argue that it’s okay to swap the order so that

$$df_t'(z) = \frac{\partial}{\partial z} \bigl(df_t(z)\bigr) = -\frac{2 f_t'(z)}{f_t(z)^2}\, dt.$$


Noticing that there is no diffusion term in this expression, this means that even though the value of $f_t$ fluctuates back
and forth, the derivative evolves in a differentiable way. We can then get the nicer expression

$$d \log(f_t'(z)) = -\frac{2}{f_t(z)^2}\, dt,$$

and now we can use this expression to make claims about the argument of $f_t'(z)$ just like we did for $f_t(z)$. What this
basically tells us about is the local rotation of points near the curve $\eta([0, t])$ under $f_t$, and that essentially tells us
how much the curve itself has been winding clockwise or counterclockwise. So if we turn back to the differential
equation, what we're being given is a way to characterize the rotation in terms of the evolution of the curve!


11 October 14, 2021 


Starting today, we'll discuss Liouville quantum gravity and imaginary geometry, introducing them in the context of 
coupling the GFF with SLE in some way. (To do this, we'll need a lot of playing around with the objects that we've 


defined earlier in the class, but we’ll start with a variety of topics.) 


Fact 65 


Professor Sheffield has four papers on “imaginary geometry” with Jason Miller, which we can read about on arXiv. 


But today and in the next few lectures, we'll focus on the paper that we can find at https://arxiv.org/pdf/1012.4797.pdf.


One object we'll be discussing is the Bessel process given by the stochastic differential equation (SDE)

$$dX_t = dZ_t + \frac{n - 1}{2} \frac{dt}{X_t},$$

where $Z_t$ is a one-dimensional Brownian motion. So much like in Ito's lemma, we have a drift term $\mu_t\, dt = \frac{n - 1}{2 X_t}\, dt$,
and we have a diffusion term $\sigma_t\, dB_t = dZ_t$, and what this kind of expression really means is that

$$X_t = \int_0^t \mu_s\, ds + \int_0^t \sigma_s\, dB_s,$$

where the first integral is an ordinary integral and the second is the stochastic integral we previously defined. (For
a more rigorous mathematical derivation, we can see Øksendal’s book “Stochastic Differential Equations.”) But the
reason these differential equations can be difficult to solve is that $\sigma_s$ and $\mu_s$ can be in terms of $s$ and $X_s$, and that’s
indeed what's happening in the SDE above.


Fact 66


A related concept we'll be making use of is conformal invariance. We learned in complex analysis that we
can always find a conformal map from a simply connected strict subset of $\mathbb{C}$ to the unit disk, but that kind
of argument works for general Riemannian manifolds as well (except if we have higher genus surfaces, we end
up with hyperbolic space). We can see some visualizations at
https://www3.cs.stonybrook.edu/~gu/gallery/RiemannUniformization/index.html.


Basically, there is a universal cover of the torus (the plane tiled by squares), and in two dimensions a Brownian
motion on this universal cover will come back to the starting square infinitely often. (So we never really reach a
boundary like we're able to do in the simpler cases.) Hyperbolic space can then be conformally related to Euclidean
space by writing down a metric of the form $e^{\rho(x, y)} (dx^2 + dy^2)$, and more generally we can have a different function
coefficient for each of the $dx^2$, $dy^2$, and $dx\, dy$ terms. And what that tells us is that defining a random conformal
equivalence is basically encoded in a random function.


Fact 67 


Imaginary geometry considers the following situation: given a real-valued function $h(x, y)$, the function $e^{ih}$ (which
is unit-circle valued and can be represented by an arrow in some angle) gives us a complex vector flow.


In the real world, if we're on a hike, two objects we might carry are an altimeter (which tells us our height modulo
some number with a needle pointed in some direction), and a compass (which tells us the direction we're facing). If we
decide we're in a situation where we want to make our compass and altimeter always point in the same direction,
we can imagine that the landscape we are traversing has height $h(x, y)$, and we follow a path along those vector flow
lines $e^{ih}$ (or more generally $e^{i(h + c)}$ for some constant $c$). The lines that we follow in this process can then be thought
of as “altimeter-compass geometry,” but this name was later changed to “imaginary geometry” because it was easier
to explain.

If we look at the metric $e^{\rho(x, y)} (dx^2 + dy^2)$ from above, notice that if $\rho$ is harmonic, then the metric is a flat metric.
What this means is that if we have a conformal map $\phi$ between two domains, where we have the regular Euclidean metric $|dz|^2$
on the initial space, we'll end up with a metric depending on $|\phi'|^2$ in the target space (because that’s the factor by
which we stretch space). Then $\log |\phi'|^2$ is harmonic, because $\phi$ being analytic (with nonvanishing derivative) implies
that $\log \phi'$ is analytic, and then we use the fact that $\log|\phi'| = \mathrm{Re} \log \phi'$ and the real and imaginary parts of an
analytic function are harmonic. More generally, the Laplacian of $\rho$ will generally relate to the Gaussian curvature of our manifold.

Motivated by this, we see that defining a random surface then depends on us defining this random function $\rho$, and
the Gaussian free field is a natural choice given what we've talked about so far, even though it's not really a function
and we need to explain why it makes sense to take the exponential of a Gaussian free field “$e^{\gamma h(z)}\,dz$.” We
can start by trying to write down

$$e^{\gamma h(z)}\,dz = \lim_{\varepsilon \to 0} e^{\gamma h_\varepsilon(z)}\,dz,$$

where $h_\varepsilon(z)$ is the mean value of $h$ on the boundary of the ball of radius $\varepsilon$ centered at $z$
(which we've shown is well-defined). But $h_\varepsilon(z)$ evolves like a Brownian motion as long as $\varepsilon$
shrinks exponentially (that is, in the time parameter $-\log\varepsilon$), so its variance is $-\log\varepsilon$ up to
an additive constant, and

$$\mathbb{E}\left[e^{\gamma h_\varepsilon(z)}\right] = e^{\operatorname{Var}(\gamma h_\varepsilon(z))/2} \approx \exp\left(-\frac{\gamma^2}{2}\log\varepsilon\right) = \varepsilon^{-\gamma^2/2}$$

(here we use the fact that $\mathbb{E}[e^{\gamma N}] = e^{\gamma^2/2}$ for a standard normal $N$). So we need to do some
additional rescaling to avoid having the expectation diverge, and we'll instead have

$$e^{\gamma h(z)}\,dz = \lim_{\varepsilon \to 0} \varepsilon^{\gamma^2/2} e^{\gamma h_\varepsilon(z)}\,dz.$$
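To see why this extra factor is a reasonable one, here is a short worked computation. It is only a sketch, under the
assumptions (not spelled out above) that $h_\varepsilon(z)$ is a centered Gaussian and that
$\operatorname{Var}(h_\varepsilon(z)) = -\log\varepsilon + c(z)$ for some quantity $c(z)$ that does not depend on
$\varepsilon$; the symbol $c(z)$ is introduced here purely for illustration.

```latex
\begin{align*}
\mathbb{E}\left[\varepsilon^{\gamma^2/2} e^{\gamma h_\varepsilon(z)}\right]
  &= \varepsilon^{\gamma^2/2} \exp\left(\frac{\gamma^2}{2}\operatorname{Var}(h_\varepsilon(z))\right) \\
  &= \varepsilon^{\gamma^2/2} \exp\left(\frac{\gamma^2}{2}\bigl(-\log\varepsilon + c(z)\bigr)\right) \\
  &= e^{\gamma^2 c(z)/2}.
\end{align*}
```

Under these assumptions the expected density no longer depends on $\varepsilon$, which is what makes a nontrivial
limit plausible (it does not by itself prove that the limit exists).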


We still have to check whether this limit will actually exist, and in the literature this is known as “Gaussian
multiplicative chaos” (we can look up Kahane's work for this).


Fact 68 
Mandelbrot actually motivated some of this work through the concept of multiplicative cascades, where we (for
instance) start with a function on a square, cut that square into four smaller copies, and on each smaller square we
either double or halve the value of the function (with probability $\frac{1}{3}$ and $\frac{2}{3}$, respectively, so
the expectation stays the same).


This process turns out to have a random limiting measure, and it can be shown that the support of this random
measure has some fractional dimension (because for a typical point the value goes to 0, but there are always some
exceptional points for which it does not). This is morally similar to the Gaussian free field: if, instead of
multiplying by 2 or $\frac{1}{2}$, we multiply by the exponential of a standard normal, then the covariance structure
between two points is determined by how far along the “branching” process the two points agree (how many common levels
there are at which their squares agree). Indeed, if we multiply by exponentials of Gaussians each time instead of just
2 or $\frac{1}{2}$, then the covariance of the logarithms depends linearly on the number of shared branches, and this is
kind of like doing a Brownian motion with respect to the log of the size of the square. But the difference is that
there are points which are very close in the Euclidean sense but may have very little correlation in the multiplicative
cascade model (if they are on the edge of different squares near the beginning).
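The cascade described above can be sketched in a few lines. This is a minimal illustration, not Mandelbrot's
construction in full generality: the function names `cascade`, `mean_factor`, and `total_mass` are hypothetical, and
the starting function is assumed to be the constant 1 on the unit square.

```python
import random


def cascade(depth, rng, p_double=1 / 3):
    """Return a 2^depth x 2^depth grid of cascade values, starting from the constant 1.

    At every level each cell is split into four children, and each child's value is
    independently doubled (probability p_double) or halved (otherwise).
    """
    grid = [[1.0]]
    for _ in range(depth):
        size = len(grid)
        new = [[0.0] * (2 * size) for _ in range(2 * size)]
        for i in range(2 * size):
            for j in range(2 * size):
                factor = 2.0 if rng.random() < p_double else 0.5
                new[i][j] = grid[i // 2][j // 2] * factor
        grid = new
    return grid


def mean_factor(p_double=1 / 3):
    """Expected multiplier per level; equal to 1 when p_double = 1/3."""
    return 2.0 * p_double + 0.5 * (1.0 - p_double)


def total_mass(grid):
    """Average of the grid values, i.e. the integral of the step function over the unit square."""
    n = len(grid)
    return sum(sum(row) for row in grid) / (n * n)


if __name__ == "__main__":
    rng = random.Random(0)
    g = cascade(6, rng)
    print("total mass:", total_mass(g), "largest cell:", max(max(row) for row in g))
```

Running this for increasing `depth` shows the behavior described above: the total mass has expectation 1 at every
level, while most individual cells become tiny and a few become very large.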

We can now come back to the story of imaginary geometry: we're interested in the two objects $e^{\gamma h}$ (from the
Gaussian free field) and $e^{ih/\chi}$ (from our vector flow), and if we interpret both as random surfaces we get the
Liouville quantum gravity surface and imaginary geometries, respectively.


Fact 69 
The sum of the angles of a triangle in hyperbolic space is not always $\pi$; instead, that value depends on the
integral of the Gaussian curvature. But in our imaginary geometry setting, the direction of our straight lines just
depends on the value of $c$ in $e^{i(h+c)}$. So if we follow a triangle in our imaginary geometry space, we'll still
have the same situation where we must increment $c$ by $2\pi$ to finish a triangle, and thus the sum of the angles
will still be $\pi$.


This means that a parallel transport process (sliding an object around a cycle without slipping) will not change
the angle of the object, but it may change the size. In contrast, on a surface like a globe, the size doesn't change
but the angle might. Essentially, our “altimeter” $h$ will tell us the amount of winding we've accumulated along a path
in our altimeter-compass geometry, and that is starting to sound like some concepts we previously talked about with
SLE! Our walk on a hexagonal lattice from lecture 1 had us turn left or right based on the value of the GFF, but if we
also add in a factor corresponding to the winding, it turns out we'll get SLE curves of different values of $\kappa$.
So that's where the coupling is going to come into play, and we'll discuss this more next time.
12 October 19, 2021


Remark 70. The “tricky part” of our problem set (finding a closed set where the expected number of loops in a
loop-soup hitting it is positive and finite) is an example of a problem where writing up the proof is probably not
publishable on its own because it's not surprising enough, but if we wanted to use the result we'd have to write it up
clearly. The answer is basically no, and the idea is that most of the expectation comes from “small loops,” but if we
want to write this up carefully as a final project we can do that.


Fact 71 


Professor Sheffield sent an email over with some interesting thoughts about the recent xkcd comic
https://xkcd.com/2529/, which we should take a look at if we're curious.


There's already some work in interpreting the “weirdly concrete” question. For example, when we “walk randomly on
a grid, never visiting any square twice,” there are many different ways to make that conditioning precise. For
instance, we can uniformly pick a walk of length $N$ that doesn't intersect itself, and look at the induced measure on
the first $n$ steps of the walk (for fixed $n$) as $N \to \infty$, which does yield a limit at least along subsequences
by compactness. Then a Cantor diagonalization argument over increasing $n$ eventually lets us construct an infinite
self-avoiding walk (SAW).
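For small lengths, the “uniformly pick a self-avoiding walk and look at its first few steps” recipe can be carried
out by brute force. The sketch below assumes the walk lives on the square lattice $\mathbb{Z}^2$ and starts at the
origin; the names `self_avoiding_walks` and `prefix_law` are hypothetical helpers, and exhaustive enumeration is only
feasible for short walks.

```python
from collections import Counter

# The four unit steps on the square lattice Z^2 (an assumption: the comic's grid is taken to be Z^2).
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def self_avoiding_walks(n):
    """Yield every n-step self-avoiding walk from the origin, as a tuple of steps."""

    def extend(path, visited, pos):
        if len(path) == n:
            yield tuple(path)
            return
        for dx, dy in STEPS:
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt not in visited:
                visited.add(nxt)
                path.append((dx, dy))
                yield from extend(path, visited, nxt)
                path.pop()
                visited.remove(nxt)

    yield from extend([], {(0, 0)}, (0, 0))


def prefix_law(n_total, n_prefix):
    """Law of the first n_prefix steps of a uniformly chosen n_total-step self-avoiding walk."""
    counts = Counter(walk[:n_prefix] for walk in self_avoiding_walks(n_total))
    total = sum(counts.values())
    return {prefix: c / total for prefix, c in counts.items()}


if __name__ == "__main__":
    for n_total in (4, 6, 8):
        law = prefix_law(n_total, 2)
        straight = law[((1, 0), (1, 0))]
        print(n_total, "P(first two steps are east, east) =", straight)
```

Watching how `prefix_law(N, n)` changes as `N` grows with `n` fixed is a finite version of the limiting procedure
described above; the code says nothing about whether the limit exists.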

It is then natural to ask about things like the dimension of this curve in the limit. For example, the number of
spots visited within a ball of radius $N$ for the self-avoiding walk is expected to be proportional to $N^{4/3}$, which
makes it closely related to the SLE curve with parameter $\frac{8}{3}$ (recall this is the curve that we get if we do
Brownian motion on a disk stopped at the boundary and then consider the boundary of that Brownian motion). However, we
do not yet know the value of the corresponding exponent $\alpha$ for the self-avoiding walk in three dimensions; based
on numerical simulation, it doesn't appear to be a nice rational (or other) number. And in very large dimension $d$, the
(ordinary) simple random walk does not hit itself very often, so the analysis can be simplified much more there.

We'll now turn back to the current chapter of our class (imaginary geometry). Last time, we started talking about
the measure $e^{\gamma h(z)}\,dz$, defined as

$$\lim_{\varepsilon \to 0} \varepsilon^{\gamma^2/2} e^{\gamma h_\varepsilon(z)}\,dz,$$


where the limit is taken in the weak sense. A good situation to consider as we talk about this is to define a Gaussian
free field only on the upper half-plane instead of the whole plane. Recall that the GFF was defined on a domain $D$
by looking at the Hilbert space closure $H(D)$ of compactly supported test functions under the Dirichlet inner product,
but we can now change our setting by allowing for bump functions that don't necessarily go to zero on the boundary
(for example, perhaps we allow bump functions that are symmetric under reflection across the real axis, so when we
restrict to the upper half-plane we have free boundary conditions instead of Dirichlet boundary conditions). Since
we can always decompose any function into an even and an odd part, we can decompose functions on $\mathbb{C}$ into those
that are symmetric and antisymmetric under reflection across the real axis, and they'll have free and Dirichlet boundary
conditions, respectively. More formally, that means we have the Hilbert space decomposition
$H(\mathbb{C}) = H_E(\mathbb{C}) \oplus H_O(\mathbb{C})$ into even and odd parts, and we can always project our GFF onto
one of these two pieces.


Fact 72 
One issue that we run into when we use the complex plane instead of an ordinary domain $D$, though, is that we are
only able to define the field up to an additive constant. One way to solve this is to only define the inner product
$(h, \rho)$ if $\int \rho(z)\,dz = 0$, because then adding a constant to $h$ does not do anything to the inner product.




The reason such a setting is good is as follows: we know (from our problem set) that if we fix the Gaussian free
field outside a large square grid of size $N$, then the variance of the value at the middle vertex scales as $\log N$,
and so it's hard to pin down anything as $N \to \infty$. But if we just make sure the average value of $\rho$ is zero,
that calms things down a lot.

Based on this analysis of other domains, if we want to define a random surface, we don't want to have to define
it modulo a scaling factor (since $e^{\gamma h}$ gets multiplied by $e^{\gamma c}$ if we add $c$ to $h$). One thing we
can do is to fix that additive constant, but doing something like subtracting off the average value of the field along
some circle of positive radius depends on the circle that we are considering and is not super natural. Instead, we'll
construct a quantum cone or quantum wedge, which works as follows. Notice that if we rescale our domain by a factor of
$R$, $e^{\gamma h}\,dz$ becomes $R^2 e^{\gamma h}\,dz$, so we have to adjust our height function as
$\tilde{h} = h - \frac{2}{\gamma}\log R$ to compensate. But we also have an $\varepsilon^{\gamma^2/2}$ factor in the
definition of our measure, and the $\varepsilon$-ball also gets rescaled when we rescale our domain, so we also need to
include an $R^{\gamma^2/2}$ factor in our definition of $e^{\gamma h(z)}\,dz$. At the end of the day, the rule by which
our height function needs to be changed when we have a domain $D$ and relate it to a domain $\tilde{D}$ via a conformal
map $\psi : \tilde{D} \to D$ is

$$\tilde{h} = h \circ \psi + Q\log|\psi'|, \qquad Q = \frac{2}{\gamma} + \frac{\gamma}{2}.$$

Then $(D, h)$ and $(\tilde{D}, \tilde{h})$ are equivalent and we can define an equivalence class, and this gets us to
our definition of a random surface which does not depend on the set used for parameterization.

Thinking more about these quantum wedges, here is another way to arrive at the boundary conditions: recall that we
can always define our GFF on $\mathbb{C}$ or the upper half-plane, and then we can conformally map it to a wedge
$\mathbb{R} \times [-1, 1]$ or infinite cylinder $\mathbb{R} \times S^1$. Instead of $h$, we can consider
$h(z) + c\log|z|$ as our height function (so that we have a singularity near $z = 0$) on the upper half-plane. In this
setting, if we define our GFF on the upper half-plane and then conformally map the resulting object onto the wedge, the
added term becomes a function that increases linearly from left to right. We can now decompose our new GFF by looking
at the average value of $h$ at each fixed horizontal coordinate $t$ (corresponding to radial averages in the original
GFF) and subtracting that off, and then the rest is orthogonal to this one-dimensional component. Specifically, because
we know that circular averages give us Brownian motions, we can then imagine that we have

$$h(t, \theta) = B_t + \tilde{h}(t, \theta),$$

where $\tilde{h}$ is mean zero for a fixed $t$. So it might seem like we want to remove the additive constant here, but
we don't want to (for example) pick out a particular point and constrain it to take on some value. (For example, we can
think about the local time for the Brownian motion, which tells us about the “amount of time” that we'll spend at
the line near a particular point, and that's not scale-invariant in the way that we want unless we do some additional
conditioning.)

Either way, this means we can construct a random scale-invariant surface on the upper half-plane (called the
quantum wedge) or on $\mathbb{R} \times S^1$ (called the quantum cone). This naming is because the $c\log|z|$ term gives
us linear growth when we look at how it affects the term $e^{\gamma h(z)}$. Once we have this quantum wedge, we have a
notion of boundary and boundary length, so we can take two quantum wedges and glue their boundaries together in a
length-preserving manner. The claim is then that under a conformal map of the resulting surface onto the upper
half-plane, the glued boundary maps to an SLE curve. (This is connected to the process of conformal welding: there's
a canonical way to get a conformal structure on a glued surface from the conformal structure of the two individual
surfaces.) But we'll be analyzing this more in the future!
13 October 21, 2021


We'll start by discussing Green's functions on $\mathbb{C}$ and $\mathbb{H}$ for the continuous GFF today. Recall that
we've already talked about this recently through the identity

$$\operatorname{Cov}((h, \rho_1), (h, \rho_2)) = \int_{D \times D} G(x, y)\rho_1(x)\rho_2(y)\,dx\,dy$$

for all test functions $\rho_1, \rho_2$ (this is one way that we can define the Gaussian free field because it is a
centered Gaussian process). Specifically, we mentioned that $G(x, \cdot)$ is a function that blows up logarithmically
around $x$ in two dimensions, so for a bounded domain $D$ we have

$$G(x, y) = -\log|x - y| - (\text{harmonic extension of } -\log|x - y| \text{ as a function of } y \text{ from } \partial D \text{ to } D).$$

Last class, we mentioned that we'd like to make sense of the covariance relation in the case where $\rho_1, \rho_2$ are
mean-zero functions when $D$ is unbounded. If we imagine having a large domain $D$, then the harmonic extension of
$-\log|x - y|$ will be approximately constant. If $\rho_1, \rho_2$ are mean-zero test functions, then adding a constant
to $G$ doesn't change $\int_{D \times D} G(x, y)\rho_1(x)\rho_2(y)\,dx\,dy$, so it makes sense to basically think that we
can take a limit and get the Green's function on the whole plane

$$G_{\mathbb{C}}(x, y) = -\log|x - y|.$$


Remark 73. Using this Green's function in an arbitrary number of dimensions yields the log-correlated Gaussian
field, and it has the nice property that we can project a log-correlated Gaussian field from higher dimension to lower
dimension. And using a Green's function of $G(x, y) = |x - y|^c$ for some $c$ gives us the fractional Gaussian field,
which (as we mentioned in a past class) basically comes out of taking a fractional Laplacian and applying it to white
noise (which is like multiplying by a power of $x^2 + y^2$ in Fourier space). An example of a fractional Gaussian field
(for a particular value of $c$) is Lévy's Brownian motion, which is a (higher-dimensional) process which looks like a
Brownian motion when restricted to any line.


Returning to our GFF distribution $h$ (on the upper half-plane $\mathbb{H}$), though, we can take the idea from last
class and define the projections

$$h^O = \frac{1}{\sqrt{2}}(h(z) - h(\bar{z})), \qquad h^E = \frac{1}{\sqrt{2}}(h(z) + h(\bar{z})).$$

If we then plug in a test function $\rho$, we have (by a change of variables)

$$(h^O, \rho) = \frac{1}{\sqrt{2}}(h, \rho - \rho^*), \qquad (h^E, \rho) = \frac{1}{\sqrt{2}}(h, \rho + \rho^*),$$

where $\rho^*(z) = \rho(\bar{z})$. This means that we can compute the covariance function for $h^O$ to find (after some
computation)

$$\operatorname{Cov}((h^O, \rho_1), (h^O, \rho_2)) = \int \rho_1(x) G^{\mathbb{H},O}(x, y)\rho_2(y)\,dx\,dy,$$

where $G^{\mathbb{H},O}(x, y) = \log|\bar{x} - y| - \log|x - y|$ (the idea is that for the harmonic extension of
$-\log|x - y|$, we can use $-\log|\bar{x} - y|$, because a real number $y$ is always the same distance from $x$ and
$\bar{x}$ but $\log|\bar{x} - y|$ is harmonic everywhere in $\mathbb{H}$). Similarly, we have

$$\operatorname{Cov}((h^E, \rho_1), (h^E, \rho_2)) = \int \rho_1(x) G^{\mathbb{H},E}(x, y)\rho_2(y)\,dx\,dy,$$

where $G^{\mathbb{H},E}(x, y) = -\log|\bar{x} - y| - \log|x - y|$ (this time the idea is that when we have free or
Neumann boundary conditions, the normal derivative will be zero).
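The “some computation” for the odd part can be written out as follows. This is a sketch under the assumptions that
$h$ is the whole-plane field with Green's function $G_{\mathbb{C}}(x, y) = -\log|x - y|$, that $\rho_1, \rho_2$ are
supported in the upper half-plane, and that $\rho^*(z) = \rho(\bar{z})$; the substitutions $x \mapsto \bar{x}$ and
$y \mapsto \bar{y}$ are used to move the reflections from the test functions onto $G_{\mathbb{C}}$.

```latex
\begin{align*}
\operatorname{Cov}((h^O, \rho_1), (h^O, \rho_2))
  &= \tfrac{1}{2}\operatorname{Cov}\bigl((h, \rho_1 - \rho_1^*), (h, \rho_2 - \rho_2^*)\bigr) \\
  &= \tfrac{1}{2}\iint G_{\mathbb{C}}(x, y)\,\bigl(\rho_1(x) - \rho_1(\bar{x})\bigr)\bigl(\rho_2(y) - \rho_2(\bar{y})\bigr)\,dx\,dy \\
  &= \tfrac{1}{2}\iint \bigl[G_{\mathbb{C}}(x, y) - G_{\mathbb{C}}(\bar{x}, y) - G_{\mathbb{C}}(x, \bar{y}) + G_{\mathbb{C}}(\bar{x}, \bar{y})\bigr]\rho_1(x)\rho_2(y)\,dx\,dy \\
  &= \iint \bigl[\log|\bar{x} - y| - \log|x - y|\bigr]\rho_1(x)\rho_2(y)\,dx\,dy.
\end{align*}
```

The last step uses $|\bar{x} - \bar{y}| = |x - y|$ and $|x - \bar{y}| = |\bar{x} - y|$. Repeating the computation with
$\rho + \rho^*$ flips the sign of the two cross terms, which gives the $-\log|\bar{x} - y| - \log|x - y|$ kernel of the
even part.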


We'll now turn back to the SLE curves and their stochastic differential equations: recall that SLE curves $\eta$ are
characterized by a conformal map $f_t(z) : \mathbb{H} \setminus \eta((0, t]) \to \mathbb{H}$ which maps the tip of our
curve at time $t$ to 0 (we can think of this as cutting a line in a piece of paper, then flattening it out so that the
cut-out line lies flat where the boundary of the paper was originally). But we can also perform this process in
reverse, mapping $\mathbb{H} \to \mathbb{H} \setminus \eta((0, t])$, and that gives us the reverse flow SLE. Since all
that happens when we reverse time is that the drift term is negated, the equations that govern the forward and reverse
flow SLE become

$$df_t(z) = \frac{2}{f_t(z)}\,dt - \sqrt{\kappa}\,dB_t, \qquad df_t(z) = -\frac{2}{f_t(z)}\,dt - \sqrt{\kappa}\,dB_t,$$


respectively. (This idea of cutting along a path or creating one leads to the concept of a “zipper.”) Ito's formula now
gives us a sign difference in a few steps for the expressions of $d\log f_t(z)$, $df_t'(z)$, and $d\log f_t'(z)$. One
place where this sign difference is noticeable is that because we picked the Brownian motions of the forward and
reverse flow SLEs to point in the same direction, we have

$$d\log f_t(z) = \frac{\pm 4 - \kappa}{2 f_t(z)^2}\,dt - \frac{\sqrt{\kappa}}{f_t(z)}\,dB_t$$

(with $+$ corresponding to forward and $-$ to reverse). We also have $df_t'(z) = \mp\frac{2 f_t'(z)}{f_t(z)^2}\,dt$ and
$d\log f_t'(z) = \mp\frac{2}{f_t(z)^2}\,dt$ (so the reverse flow has $+$ instead of $-$ in both). In particular, notice
that we cannot make the drift term of $d\log f_t(z)$ go away for the reverse flow like we did for the forward flow, so
we won't get a martingale. But we can take a linear combination of $\log f_t(z)$ and $\log f_t'(z)$ to get a
martingale, which we'll show now. If we define

$$\chi = \frac{2}{\sqrt{\kappa}} - \frac{\sqrt{\kappa}}{2}, \qquad Q = \frac{2}{\sqrt{\kappa}} + \frac{\sqrt{\kappa}}{2},$$

then we can define

$$\mathfrak{h}_t^*(z) = -\frac{2}{\sqrt{\kappa}}\log f_t(z) - \chi\log f_t'(z), \qquad \mathfrak{h}_t^*(z) = \frac{2}{\sqrt{\kappa}}\log f_t(z) + Q\log f_t'(z)$$

in the forward and reverse cases, respectively. These turn out to be the linear combinations that make the drift term
go away, and in fact they also make the $\sqrt{\kappa}$ factor in the Brownian motion term vanish as well (since we end
up with $d\mathfrak{h}_t^*(z) = \pm\frac{2}{f_t(z)}\,dB_t$). We can then take the imaginary part in the forward flow case
and the real part in the reverse flow case to get martingales $\mathfrak{h}_t(z)$ which satisfy

$$d\mathfrak{h}_t(z) = \operatorname{Im}\frac{2}{f_t(z)}\,dB_t, \qquad d\mathfrak{h}_t(z) = -\operatorname{Re}\frac{2}{f_t(z)}\,dB_t,$$

respectively.
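The drift cancellation can be checked directly. The sketch below takes as inputs the two differentials
$d\log f_t(z) = \frac{\pm 4 - \kappa}{2 f_t(z)^2}\,dt - \frac{\sqrt{\kappa}}{f_t(z)}\,dB_t$ and
$d\log f_t'(z) = \mp\frac{2}{f_t(z)^2}\,dt$ (upper signs for the forward flow, lower signs for the reverse flow), with
$\chi = \frac{2}{\sqrt{\kappa}} - \frac{\sqrt{\kappa}}{2}$ and $Q = \frac{2}{\sqrt{\kappa}} + \frac{\sqrt{\kappa}}{2}$.

```latex
\begin{align*}
d\left(-\tfrac{2}{\sqrt{\kappa}}\log f_t(z) - \chi\log f_t'(z)\right)
  &= -\tfrac{2}{\sqrt{\kappa}}\left(\tfrac{4 - \kappa}{2 f_t(z)^2}\,dt - \tfrac{\sqrt{\kappa}}{f_t(z)}\,dB_t\right) + \tfrac{2\chi}{f_t(z)^2}\,dt \\
  &= \tfrac{1}{f_t(z)^2}\left(2\chi - \tfrac{4 - \kappa}{\sqrt{\kappa}}\right)dt + \tfrac{2}{f_t(z)}\,dB_t
   = \tfrac{2}{f_t(z)}\,dB_t, \\[4pt]
d\left(\tfrac{2}{\sqrt{\kappa}}\log f_t(z) + Q\log f_t'(z)\right)
  &= \tfrac{2}{\sqrt{\kappa}}\left(\tfrac{-4 - \kappa}{2 f_t(z)^2}\,dt - \tfrac{\sqrt{\kappa}}{f_t(z)}\,dB_t\right) + \tfrac{2Q}{f_t(z)^2}\,dt \\
  &= \tfrac{1}{f_t(z)^2}\left(2Q - \tfrac{4 + \kappa}{\sqrt{\kappa}}\right)dt - \tfrac{2}{f_t(z)}\,dB_t
   = -\tfrac{2}{f_t(z)}\,dB_t.
\end{align*}
```

Both drift terms vanish because $2\chi = \frac{4 - \kappa}{\sqrt{\kappa}}$ and $2Q = \frac{4 + \kappa}{\sqrt{\kappa}}$,
which is one way to see where these particular constants come from.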


Remark 74. Recall that the reason for taking the imaginary part for the forward flow SLE was to ensure our local
martingale is actually a true martingale. We want to avoid something like $B_{-1/t}$ as $t \to 0^-$, which has zero
drift but does not satisfy the optional stopping theorem. This is essentially like “running Brownian motion starting
from 1 until it hits 0.” Bounded local martingales are always martingales, and so are local martingales which can only
change by at most some total amount. In particular, if we reparameterize a local martingale's time based on its
quadratic variation, it will become a martingale.


If we think more about what $\operatorname{Im}\frac{2}{f_t(z)}\,dB_t$ actually looks like, note that the function
$\operatorname{Im}\left(\frac{2}{z}\right)$ has level sets which are given by circles tangent to the real line at the
origin (the function is zero on the real line and blows up as we approach 0 along the imaginary axis). So the martingale
we've just defined has to do with pulling those circles back from $\mathbb{H}$ to $\mathbb{H} \setminus \eta((0, t])$ to
get a function that blows up near $\eta(t)$, and every time we do a bit of exploration for the curve, we add or subtract
a small multiple of that pulled-back function. (In particular, there is lots of fluctuation near the tip for
$\mathfrak{h}_t(z)$, which makes sense because that's where the probabilities of ending up on the left or right of the
curve for a point $z$ near the tip can drastically change.) It also makes sense that the fluctuation here is zero on the
boundary (since the curve $\eta([0, t])$ is mapped to the real axis).


This now motivates us to define

$$C_t(z) = \pm\log\operatorname{Im} f_t(z) - \operatorname{Re}\log f_t'(z).$$

At time $t = 0$, $f_0(z) = z$ and this function is just $\pm\log\operatorname{Im}(z)$, so for the forward flow it's the
log of the distance from the point $z$ to the boundary of $\mathbb{H}$. More generally, it turns out that in the forward
case $C_t(z)$ is (up to an additive constant) the log of the conformal radius of $\mathbb{H} \setminus \eta([0, t])$
viewed from $z$ (basically, corresponding points in two domains have the same conformal radius if the derivative of the
conformal map $\phi$ between them has absolute value 1 there, and the definition agrees with the usual radius for
disks). A cool fact is that the Koebe $\frac{1}{4}$ theorem tells us that the conformal radius and inradius are the same
up to a factor of 4. And this $C_t(z)$ also satisfies the identity

$$d\langle \mathfrak{h}_t(z), \mathfrak{h}_t(z) \rangle = -dC_t(z).$$
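For the forward flow, this identity can be verified by hand. The sketch below writes $f_t(z) = x + iy$ (the letters
$x, y$ are introduced here only for the computation) and uses $df_t(z) = \frac{2}{f_t(z)}\,dt - \sqrt{\kappa}\,dB_t$ and
$d\log f_t'(z) = -\frac{2}{f_t(z)^2}\,dt$. Since $dB_t$ is real, $\operatorname{Im} f_t(z)$ has no Brownian part and no
Ito correction is needed.

```latex
\begin{align*}
d\log\operatorname{Im} f_t(z)
  &= \frac{\operatorname{Im}(2/f_t(z))}{\operatorname{Im} f_t(z)}\,dt = -\frac{2}{x^2 + y^2}\,dt, \\
\operatorname{Re}\,d\log f_t'(z)
  &= \operatorname{Re}\left(-\frac{2}{f_t(z)^2}\right)dt = -\frac{2(x^2 - y^2)}{(x^2 + y^2)^2}\,dt, \\
dC_t(z)
  &= \left(-\frac{2}{x^2 + y^2} + \frac{2(x^2 - y^2)}{(x^2 + y^2)^2}\right)dt
   = -\frac{4y^2}{(x^2 + y^2)^2}\,dt
   = -\left(\operatorname{Im}\frac{2}{f_t(z)}\right)^2 dt.
\end{align*}
```

The right-hand side is exactly minus the squared coefficient of $dB_t$ in the forward-flow martingale, which is what the
identity asserts. The reverse-flow case works the same way with $\operatorname{Re}$ in place of $\operatorname{Im}$ and
the opposite sign on the $\log\operatorname{Im}$ term.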


14 October 26, 2021 


Last lecture, we defined martingales $\mathfrak{h}_t$ out of our SLE curves, specifically writing out stochastic
differential equations in terms of the function $f_t(z) = g_t(z) - W_t$ (which is a map from
$\mathbb{H} \setminus \eta([0, t]) \to \mathbb{H}$ which always maps the tip to the origin). We mentioned that there is
both a forward flow SLE and a reverse flow SLE, and they give rise to slightly different equations, so let's review how
we arrived at them. Recall that we start with the forward flow SLE equation

$$df_t(z) = \frac{2}{f_t(z)}\,dt - \sqrt{\kappa}\,dB_t,$$

and then we can “reverse the vector flow equations” (so that we're now mapping
$\mathbb{H} \to \mathbb{H} \setminus \eta((0, t])$ in a time-reversed way) to get the reverse flow SLE equation

$$df_t(z) = -\frac{2}{f_t(z)}\,dt - \sqrt{\kappa}\,dB_t.$$

(Since Brownian motion also arises from the limit of a simple random walk, it's also possible to understand this
evolution as tossing a coin and having the curve move slightly to the left or right at each step, then applying the
corresponding conformal map.) We can then use Ito's formula,
$df_t(z) = \mu_t\,dt + \sigma\,dB_t \implies dF(f_t(z)) = F'(f_t(z))\,df_t(z) + \frac{1}{2}F''(f_t(z))\sigma^2\,dt$, to
derive ($+$ for forward, $-$ for backward)

$$d\log f_t(z) = \frac{\pm 4 - \kappa}{2 f_t(z)^2}\,dt - \frac{\sqrt{\kappa}}{f_t(z)}\,dB_t.$$

We can then also compute

$$df_t'(z) = \mp\frac{2 f_t'(z)}{f_t(z)^2}\,dt \implies d\log f_t'(z) = \mp\frac{2}{f_t(z)^2}\,dt$$

(since we can commute differentiation with respect to $t$ and $z$). We then defined the constants
$\chi = \frac{2}{\sqrt{\kappa}} - \frac{\sqrt{\kappa}}{2}$ for forward flow and
$Q = \frac{2}{\sqrt{\kappa}} + \frac{\sqrt{\kappa}}{2}$ for reverse flow, with the idea that we can form the linear
combinations

$$\mathfrak{h}_t^*(z) = -\frac{2}{\sqrt{\kappa}}\log f_t(z) - \chi\log f_t'(z), \qquad \mathfrak{h}_t^*(z) = \frac{2}{\sqrt{\kappa}}\log f_t(z) + Q\log f_t'(z).$$

These processes are local martingales because $d\mathfrak{h}_t^*(z) = \pm\frac{2}{f_t(z)}\,dB_t$ has zero drift term
$dt$. Finally, we defined our martingales

$$\mathfrak{h}_t = \operatorname{Im}\mathfrak{h}_t^*, \qquad \mathfrak{h}_t = \operatorname{Re}\mathfrak{h}_t^*.$$
We're now caught up and will use a few more ideas from stochastic calculus:


Definition 75 


Recall that the quadratic variation of a stochastic process is given by

$$\langle X, X \rangle_t = \lim_{\text{mesh} \to 0} \sum_{i=1}^{k} (X_{t_i} - X_{t_{i-1}})^2,$$

where $(0 = t_0, t_1, \cdots, t_k = t)$ is a partition of the time-interval $[0, t]$. (This limit is in the sense of
convergence in probability.)


The idea here is that over a time $\varepsilon$, Brownian motion fluctuates on the order of $\sqrt{\varepsilon}$, so
squaring that fluctuation should give us $\langle B, B \rangle_t = t$, while something like a smooth function fluctuates
much less and should have quadratic variation zero. But to actually show that $\langle B, B \rangle_t = t$, we need to
show that the errors in terms like $B_\varepsilon^2 - \varepsilon$ (when our partition has a segment of length
$\varepsilon$ and $B_\varepsilon$ denotes the increment over it) go away as $\varepsilon \to 0$, and the way we do this
is by showing the expectation of $\left(\sum (B_\varepsilon^2 - \varepsilon)\right)^2$ goes to 0 (this can be shown
because the fourth moment of a Gaussian random variable is finite). But what's nice with this definition of the
quadratic variation is that it's well-defined even under time-changes like $B_{\sigma(t)}$ for an increasing function
$\sigma$.


For our purposes, though, if we have $df_t(z) = \mu_t\,dt + \sigma_t\,dB_t$, then our quadratic variation will just be
$\int \sigma_t^2\,dt$.
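The contrast between Brownian motion and a smooth function can be seen numerically. The following is a minimal
simulation sketch, not a proof: `brownian_path` and `quadratic_variation` are hypothetical helper names, the partition
is assumed to be uniform, and the Brownian path is sampled exactly at the partition points using independent Gaussian
increments.

```python
import math
import random


def brownian_path(t, n, rng):
    """Sample Brownian motion at n + 1 equally spaced times in [0, t]."""
    dt = t / n
    path = [0.0]
    for _ in range(n):
        # Each increment over a segment of length dt is N(0, dt).
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def quadratic_variation(path):
    """Sum of squared increments along the sampled partition."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


if __name__ == "__main__":
    rng = random.Random(1)
    t, n = 2.0, 20000
    print("Brownian motion:", quadratic_variation(brownian_path(t, n, rng)))  # close to t
    smooth = [math.sin(t * i / n) for i in range(n + 1)]
    print("smooth function:", quadratic_variation(smooth))  # close to 0
```

Refining the partition (larger `n`) drives the first number toward `t` and the second toward 0, matching the heuristic
that each Brownian increment contributes about $\varepsilon$ while each smooth increment contributes about
$\varepsilon^2$.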


Definition 76 


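A more general version of the quadratic variation is the covariation

$$\langle X, Y \rangle_t = \lim_{\text{mesh} \to 0} \sum_{i=1}^{k} (X_{t_i} - X_{t_{i-1}})(Y_{t_i} - Y_{t_{i-1}}).$$

However, we can use the usual Hilbert space identity to write

$$\langle X, Y \rangle_t = \frac{1}{2}\left(\langle X + Y, X + Y \rangle_t - \langle X, X \rangle_t - \langle Y, Y \rangle_t\right),$$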


so we can compute covariation as long as we have well-defined quadratic variations for $X$, $Y$, $X + Y$. And with this
in mind, we can return to our SLE equations and keep doing calculations: we've mentioned previously that because
$-\log|x - y|$ is the two-dimensional Green's function on $\mathbb{C}$, we have (for the GFF)

$$\operatorname{Cov}((h, \rho_1), (h, \rho_2)) = \iint G_{\mathbb{C}}(x, y)\rho_1(x)\rho_2(y)\,dx\,dy,$$

as long as $\rho_1$ and $\rho_2$ are test functions with average 0 (this is a distribution modulo additive constants).
Then we can have the free-boundary and fixed-boundary conditions on $\mathbb{H}$ given by Green's functions where we
take linear combinations of $G_{\mathbb{C}}(x, y)$ and $G_{\mathbb{C}}(x, \bar{y})$: for the forward flow SLE, we'll use
the Green's function

$$G(y, z) = \log|y - \bar{z}| - \log|y - z|$$

(fixed boundary free field), and for the reverse flow SLE, we'll use

$$G(y, z) = -\log|y - \bar{z}| - \log|y - z|.$$

This time, we actually want a free boundary free field because of the “quantum gravity zipper.” The idea is that
using $e^{\gamma h}$ defines a measure on $\mathbb{H}$, and then $e^{\gamma h/2}$ defines a measure on the boundary (the
real axis). But if $h$ is only defined modulo a constant, we only know this measure modulo a scaling. If we now mark the
real line with unit length increments, then the relative lengths on the two sides of the origin do not change when we
add a constant to $h$. If we now connect the right and left sides of the real line with a pairing (based on distance
from the origin), we want to try gluing the left and right side of the real axis together like a zipper: then those two
sides will be the “left” and “right” sides of an upward-moving path in our new random surface, and that's what our
quantum zipper does.


Fact 77 


If we contrast this random welding of Riemannian geometry $e^{\gamma h}$ with “following the arrows” in our imaginary
geometry vector field created by $e^{ih/\chi}$, we have two completely different ways of arriving at a path on our
surface. But it turns out these two stories are related: the basic calculations are similar enough that one is
essentially governed by the forward flow and one by the reverse flow equations.


Turning back to our calculations, we'll now define

$$G_t(y, z) = G(f_t(y), f_t(z))$$

for both the forward and reverse flow SLE. It turns out that $G_t$ is a decreasing function in $t$. We can justify this
because adding more of the curve $\eta$ creates more of an obstruction, so interpreting $G$ as “the time spent around
$z$ if we have a Brownian motion started at $y$ and stopped at the boundary,” larger $t$ means we'll have fewer visits in
expectation. From here, we can compute that

$$d\log(f_t(y) - f_t(z)) = \frac{df_t(y) - df_t(z)}{f_t(y) - f_t(z)} = \mp\frac{2}{f_t(y) f_t(z)}\,dt$$

(with $-$ for forward and $+$ for reverse; because this has no $dB_t$ term, we don't need to worry about the additional
Ito correction), and doing the same thing with the other log term of the Green's function yields (after some
manipulation of complex numbers)

$$dG_t(y, z) = -\operatorname{Im}\frac{2}{f_t(y)}\operatorname{Im}\frac{2}{f_t(z)}\,dt$$

in the forward-flow case, and similarly

$$dG_t(y, z) = -\operatorname{Re}\frac{2}{f_t(y)}\operatorname{Re}\frac{2}{f_t(z)}\,dt$$

in the reverse-flow case. It turns out negating this expression gets us exactly to the quadratic covariation:

$$d\langle \mathfrak{h}_t(y), \mathfrak{h}_t(z) \rangle = -dG_t(y, z).$$
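The “manipulation of complex numbers” can be sanity-checked numerically with a finite difference. The sketch below is
only a spot check at two sample points, with hypothetical helper names. It relies on the observation that both Green's
functions depend only on $f_t(y) - f_t(z)$ and $f_t(y) - \overline{f_t(z)}$, which are unchanged by the common real
shift $-\sqrt{\kappa}\,dB_t$, so only the $dt$ part of the flow needs to be stepped.

```python
import math


def green(a, b, reverse=False):
    """Fixed-boundary (forward) or free-boundary (reverse) Green's function on the upper half-plane."""
    sign = -1.0 if reverse else 1.0
    return sign * math.log(abs(a - b.conjugate())) - math.log(abs(a - b))


def flow_step(w, dt, reverse=False):
    """One Euler step of the dt part of df = +-(2/f) dt - sqrt(kappa) dB.

    The dB part is omitted on purpose: it shifts both points by the same real amount,
    and `green` is invariant under such shifts.
    """
    drift = -2.0 / w if reverse else 2.0 / w
    return w + drift * dt


def dG_numeric(a, b, dt, reverse=False):
    """Finite-difference approximation of dG_t/dt at f_t(y) = a, f_t(z) = b."""
    moved = green(flow_step(a, dt, reverse), flow_step(b, dt, reverse), reverse)
    return (moved - green(a, b, reverse)) / dt


def dG_formula(a, b, reverse=False):
    """Closed form: -Im(2/a) Im(2/b) for the forward flow, -Re(2/a) Re(2/b) for the reverse flow."""
    if reverse:
        return -(2.0 / a).real * (2.0 / b).real
    return -(2.0 / a).imag * (2.0 / b).imag


if __name__ == "__main__":
    a, b = 0.3 + 1.2j, -0.7 + 0.5j  # arbitrary sample values of f_t(y), f_t(z)
    for reverse in (False, True):
        print(reverse, dG_numeric(a, b, 1e-6, reverse), dG_formula(a, b, reverse))
```

Agreement of the two columns at a handful of points is of course not a derivation, but it is a quick way to catch a
sign error in either formula.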


(If we think of $\langle X, X \rangle_t$ as morally $\int_0^t \sigma_s^2\,ds$, then we can think of
$\langle X, Y \rangle_t$ as morally $\int_0^t \sigma_s^X \sigma_s^Y\,ds$, which tells us the local correlation of the two
diffusive parts. Since we previously computed $d\mathfrak{h}_t(z) = \operatorname{Im}\frac{2}{f_t(z)}\,dB_t$ in the
forward case, this should be believable.) The next thing we'll try to compute is the quadratic variation of
$(\mathfrak{h}_t, \rho)$, which we can think of as an integral over $\mathfrak{h}_t$. It turns out that because we can
add cross-contributions from diffusive terms when we're computing the quadratic variation for a finite sum, the same
holds for integrals, and we end up finding that

$$d\langle (\mathfrak{h}_t, \rho), (\mathfrak{h}_t, \rho) \rangle = -\iint \rho(y)\,dG_t(y, z)\,\rho(z)\,dy\,dz = -dE_t(\rho),$$

where $E_t(\rho) = \iint \rho(y) G_t(y, z)\rho(z)\,dy\,dz$. This is connected to the “energy of assembly” for the system
which we've discussed previously, and it's basically a conditional variance of the Gaussian free field on a test
function $\rho$. Since $G_t$ (and hence $E_t(\rho)$) is decreasing with $t$, and $(\mathfrak{h}_t, \rho)$ is a
martingale, we can imagine that we start at time $-E_0(\rho)$ and parameterize time as $-E_t(\rho)$ (which is increasing
because $E_t$ is decreasing). Then the variance relation we mentioned before tells us that
$(-E_t(\rho), (\mathfrak{h}_t, \rho))$ traces out a Brownian motion as $t$ ranges from 0 to the stopping time $T$, and
then we just need to run the Brownian motion up from $-E_T(\rho)$ until time 0.

That then tells us that the final value of $(h, \rho)$ is a Gaussian with variance $E_0(\rho)$ (centered at the starting
value $(\mathfrak{h}_0, \rho)$), and here's another way to say that: we can either (1) take our upper half-plane and
choose the GFF $h$, or (2) draw a path $\eta$ under SLE for a while, take the $\mathfrak{h}_t$ we get at that time, and
then add a Gaussian free field $\tilde{h}$ in the remaining unexplored space. And it turns out that these give us the
same object, because we just need to check that this is true on each test function $\rho$, which is what we've been
doing above.


15 October 28, 2021 


Last time, we did some calculations involving the martingale $\mathfrak{h}_t(z)$ for each fixed $z$ (which is a linear
combination of the imaginary parts of $\log f_t$ and $\log f_t'$ in the forward flow case, and a linear combination of
the real parts of those functions in the reverse flow case). Specifically, we considered the quadratic variation
$d\langle (\mathfrak{h}_t, \rho), (\mathfrak{h}_t, \rho) \rangle$ and found that it was equal to a double integral
$-\iint dG_t(y, z)\rho(y)\rho(z)\,dy\,dz$, which is what we defined to be $-dE_t(\rho)$.

We'll now gradually work towards understanding the significance of this result. Here's the main result that we
proved with our work last time but did not actually state explicitly:


Theorem 78 
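Fix some $\kappa \in (0, 4]$, and consider the $\text{SLE}_\kappa$ segment $\eta_T$ up to time $T$ generated by the
equation $df_t(z) = \frac{2}{f_t(z)}\,dt - \sqrt{\kappa}\,dB_t$, $f_0(z) = z$. Define
$\mathfrak{h}_0(z) = -\frac{2}{\sqrt{\kappa}}\arg(z)$ and $\chi = \frac{2}{\sqrt{\kappa}} - \frac{\sqrt{\kappa}}{2}$, and
then define

$$\mathfrak{h}_t(z) = \mathfrak{h}_0(f_t(z)) - \chi\arg f_t'(z),$$

with details described in Remark 79. Now let $\tilde{h}$ be an independent zero-boundary GFF on $\mathbb{H}$. Then the
distributions

$$h = \mathfrak{h}_0 + \tilde{h}, \qquad h \circ f_T - \chi\arg f_T' = \mathfrak{h}_T + \tilde{h} \circ f_T$$

agree in law.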


Remark 79. Notice that because $\mathfrak{h}_0$ is harmonic and $f_T$ is a conformal map, $\mathfrak{h}_0(f_T(z))$ is
harmonic and thus specified by its boundary conditions: those conditions are that we have 0 to the right of $\eta_T$
and $-\frac{2\pi}{\sqrt{\kappa}}$ to the left of it. Similarly, because $f_T'(z)$ must be real when $z$ is real,
$\chi\arg f_T'(z)$ is zero on the real line, and on the remainder of the boundary (namely the curve $\eta_T$),
$\arg(f_T'(z))$ takes on values that depend on the amount of winding of the curve (and between the left and right sides
of the curve, the gap is always $\pi$). Since $\chi\arg f_T'(z)$ is harmonic, this second term is a “harmonic extension
of the amount of winding of $\eta$ to the rest of the domain,” and we ask it to be continuous on
$\mathbb{H} \setminus \eta_T$ and tending to 0 at $\infty$ to avoid ambiguity with the definition of $\arg$.

We'll notice that there are two different height gaps here from the two terms of $\mathfrak{h}_t(z)$, $-\chi\pi$ and
$\frac{2\pi}{\sqrt{\kappa}}$, and thus adding them together yields an overall gap between the left and right sides of the
curve for $\mathfrak{h}_t(z)$.


Here, our first object $h$ is basically a GFF with boundary conditions $-\frac{2\pi}{\sqrt{\kappa}}$ on the negative real
axis and 0 on the positive real axis, while our second object takes the more complicated harmonic function
$\mathfrak{h}_T$ that we just described and adds to it a GFF $\tilde{h} \circ f_T$ on the remaining domain
$\mathbb{H} \setminus \eta_T$ (since the GFF is conformally invariant and $f_T$ is a conformal map); more specifically,
it's the pullback of $\tilde{h}$ from the first object. And what we're saying is that defining the free field with the
positive/negative real axis boundary conditions is the same as evolving our SLE curve for a bit and then working with
the more complicated harmonic functions.
In the special case where $\chi$ is 0 (that is, $\kappa = 4$) and we have a level set of the Gaussian free field, it
makes sense (from our earlier discussion) that we can either observe the Gaussian free field all at once, or first on a
level set and then with additional conditioning (this is like the Markov property on the hexagons in the discrete case
we described earlier on in class). So this tells us that everything behaves as it should in the limit, if our level set
is an $\text{SLE}_4$.

We claim that we've actually proved this theorem already, and it follows because defining a Gaussian free field relies
on looking at pairings that look like $(h, \rho)$. Specifically, we know that if this expression is normal with mean 0
and variance $\iint G(x, y)\rho(x)\rho(y)\,dx\,dy$ for every test function $\rho$, then $h$ is already the Gaussian free
field. (Remember that this is surprisingly powerful: we just have to test one test function at a time, which already
tells us that linear combinations of these test functions are also Gaussian, so this tells us the characteristic
function of $h$ and thus does require $h$ to be a Gaussian random field. From there, we can get covariances in terms of
the polarization identity.)


Now because hy is a martingale, the expectations E[(ho + h, p)] and E(hr + ho fr, p)] are both equal to (ho, p), 


so the means line up and we just need to check the variances. And that's the point we made at the very end of last 
class: if we imagine a Brownian motion that runs from time 0 to 10, we can either obtain By directly (as a normal 
random variable with mean O and variance 10) or obtain By, first and then sample Bio as a normal random variable 
with mean B, and variance 9. More generally, if s is a stopping time that is at most 10 almost surely, then Big can 
be obtained by getting B; and then sampling a random variable with mean B, and variance 10 — s. The key insight Is 
then that (hz, p) is evolving like a martingale, so it’s evolving like a time-changing Brownian motion (where the notion 
of time is given by —E;(p)). 

So here's the important argument: if we imagine a Brownian motion started at time $-E_0(\rho)$ and run until time $-E_T(\rho)$, then the fluctuation in the Brownian motion will have the same law as $(\mathfrak{h}_T, \rho) - (\mathfrak{h}_0, \rho)$. And adding the $E_T(\rho)$ term (an independent Gaussian with that variance) at the last step is basically like sampling around the mean $B_s$ in our ordinary Brownian motion: adding this term is equivalent to continuing the Brownian motion for time $E_T(\rho)$, so it's like we just ran a Brownian motion from time $-E_0(\rho)$ to time 0, regardless of $T$.
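To make the two-stage sampling picture concrete, here is a minimal simulation sketch (not part of the lecture). It compares drawing $B_{10}$ directly with drawing it through a toy stopping time. The function names and the particular stopping rule (stop at time 1 if $B_1 > 0$, otherwise at time 4) are assumptions chosen only for illustration.

```python
import random
import statistics


def sample_direct(rng, total=10.0):
    """Draw B_total in one step: normal with mean 0 and variance total."""
    return rng.gauss(0.0, total ** 0.5)


def sample_via_stopping_time(rng, total=10.0):
    """Draw B_total in two stages through a toy stopping time s <= total.

    Hypothetical rule: stop at s = 1 if B_1 > 0, otherwise stop at s = 4.
    """
    b = rng.gauss(0.0, 1.0)              # B_1
    s = 1.0
    if b <= 0.0:
        b += rng.gauss(0.0, 3.0 ** 0.5)  # continue from time 1 to time 4
        s = 4.0
    # Given B_s, B_total is normal with mean B_s and variance total - s.
    return rng.gauss(b, (total - s) ** 0.5)


def empirical_variance(sampler, n=20000, seed=0):
    """Sample variance of n independent draws from the given sampler."""
    rng = random.Random(seed)
    return statistics.pvariance([sampler(rng) for _ in range(n)])
```

Both `empirical_variance(sample_direct)` and `empirical_variance(sample_via_stopping_time)` should come out close to 10, which is the statement that the stopping time does not change the law of the endpoint.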


Next, we'll look at another result, which is the corresponding theorem for reverse coupling: 


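Theorem 80
Fix some $\kappa > 0$, and consider the $\mathrm{SLE}_\kappa$ segment up to time $T$ generated by the reverse Loewner flow $df_t(z) = -\frac{2}{f_t(z)}\,dt - \sqrt{\kappa}\,dB_t$, $f_0(z) = z$. Define $\mathfrak{h}_0(z) = \frac{2}{\sqrt{\kappa}} \log|z|$ and $Q = \frac{2}{\sqrt{\kappa}} + \frac{\sqrt{\kappa}}{2}$, and then define

$$\mathfrak{h}_t(z) = \mathfrak{h}_0(f_t(z)) + Q \log|f_t'(z)|.$$

Let $\tilde h$ be an independent zero-boundary GFF on $\mathbb{H}$. Then the distributions

$$h = \mathfrak{h}_0 + \tilde h, \qquad h \circ f_T + Q \log|f_T'| = \mathfrak{h}_T + \tilde h \circ f_T$$

agree in law.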


This time, remember that $f_T$ is the “zipping-up map,” where the $Q \log|f_T'|$ term comes from a change of coordinates: it's how we preserve measures when parameterizing Liouville quantum gravity surfaces with different sets. So this theorem tells us how to define a random surface modulo a constant in $h$ (so a surface modulo scaling of the metric $e^{\gamma h}$), and it says we can do this in two ways: either draw $\mathfrak{h}_0$ and add a Gaussian free field to that, or generate a random surface on $\mathbb{H} \setminus \eta$ (the half-plane minus the curve) and then map it back via coordinate change (in other words, zip up part of the real line along an SLE curve, find $h$ on the zipped-up domain, and then change back). So that means that if we generate a random surface with this process (it's essentially the scaling limit of gluing together many quadrilaterals in a discrete random surface), then randomly cutting the surface along SLE does not change the law.




Fact 81
The results we've just described are Theorem 1.1 and Theorem 1.2 in https://arxiv.org/pdf/1012.4797.pdf, and we can also understand this concept by thinking about a scaling limit of a discrete graph, considering a spanning tree plus a chord that connects two of the boundary points $a, b$, and then sending $a$ to 0 and $b$ to $\infty$ when we map this discrete graph to the upper half-plane using the Riemann mapping theorem. (The chord becomes $\mathrm{SLE}_2$, and in fact it looks like $\mathrm{SLE}_2$ regardless of the surface on which it's drawn.) But we should see Figure 2.4 for more details.


16 November 2, 2021 


We'll discuss random planar maps today: 


Definition 82
A planar map is a graph that comes with an embedding into the plane, such that topological differences matter and such that edges aren't allowed to cross each other. Specifically, two embeddings are equivalent if there is a diffeomorphism between them, so for each vertex we essentially need to know a cyclic order of its edges.


We can check that we can work out all of the polygonal faces by exploring vertices one by one. Mathematicians who study planar maps often study random planar maps, such as a random triangulation, in which we take $N$ triangles (each with a clockwise orientation) and randomly glue the triangles together to get a topological surface. However, the resulting shape is not necessarily simply connected (for example, we can form a torus by gluing together enough edges), so we often uniformly pick one of the triangulations that is specifically topologically equivalent to a sphere.


Fact 83
Two settings in which triangulations were considered were the four-color theorem of graph theory and calculating expected traces of Gaussian matrices in physics (which ends up summing over random maps). But what changed the subject was Schaeffer's discovery of a bijection involving geodesic trees in planar maps, together with a way to describe the probabilistic law of those trees in the limit. These tree techniques are still the main way to connect the stories of planar maps and of Liouville quantum gravity.


We'll describe one of these bijections right now. Start with a planar map, as shown below, and think of the complex plane as a topological sphere, so that there is one large face on the outside. (All pictures are taken from the paper https://arxiv.org/pdf/1108.2241.pdf.)




Now for every face in our planar map, draw a red dot, and connect those red dots to all of the vertices on the boundary of the face:

Now notice that this is actually a quadrangulation: erasing the black edges gives us a set of green quadrilaterals, each made up of two triangles in the picture above:


Because $F - E + V$ is invariant (for planar maps on spheres) by Euler's formula, and every quadrilateral has four edges while every edge borders two faces (so $E = 2F$), we can determine the number of faces, edges, and vertices just by knowing that we have a quadrangulation of $N$ quadrilaterals. Also, notice that quadrangulations are always bipartite graphs, because if we draw any cycle in the graph, it encloses some number of quadrilaterals, so the number of edges on the loop has to be even. (We can count the number of (edge, face) pairs where the edge belongs to an enclosed face: this count is four times the number of enclosed faces, and the edges in the middle are counted twice, so there must be an even number of edges on the outside.)
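As a worked version of this count (a short check that uses only Euler's formula $F - E + V = 2$ for the sphere and the fact that each edge borders two quadrilaterals):

```latex
\begin{align*}
F &= N, \\
4F &= 2E \implies E = 2N, \\
V &= 2 - F + E = 2 - N + 2N = N + 2.
\end{align*}
```

For instance, a single 4-cycle drawn on the sphere has $N = 2$ faces, $E = 4$ edges, and $V = 4$ vertices, matching $E = 2N$ and $V = N + 2$.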

Next, suppose we decorate our planar map by a spanning tree, so that the edges of the spanning tree are solid black and the others are dashed:

If we add the green edges back, some of our quadrilaterals will have solid black lines, and some will have dashed lines. We now flip the dashed lines (replacing each one by the other diagonal of its quadrilateral), and we notice that they will form a new tree (whose vertices are the faces of the original planar map):


This is the dual tree of our original spanning tree: we can check that if we had a cycle in the original set of solid black lines, then we'd have two disconnected components in the dashed picture, and vice versa. We can now trace an interface between the tree and its dual: every triangle has two edges that are green, and one which is solid or dashed. So each time we enter a triangle from a green edge, we exit it from its other green edge:

Finally, notice that if we ignore all of our colors and keep all of the edges, we get a triangulation, but because of the existence of this traced interface, this triangulation has a Hamiltonian cycle on its faces:

We can look more carefully at the planar map walk, though, by picking a starting green edge on one of our triangles and then walking through the faces with the rule as before. Each green edge will have one blue endpoint and one red endpoint, and at each step, we can keep track of how far those blue and red endpoints are from the starting blue and red endpoints on our starting green edge (where distance is measured in terms of “number of edges on the tree”). Since each new green edge we visit is connected to the previous one, either the blue distance or the red distance will remain the same, and the other one will either increase by 1 or decrease by 1 (depending on whether we get closer to or farther from the root).

That process gives us a random walk on $\mathbb{Z}^2_{\geq 0}$ if we plot (blue distance, red distance), and it turns out this is actually a bijection between rooted planar maps decorated by a spanning tree and such walks that start and end at the origin.


Fact 84
In the paper linked above, Professor Sheffield called the $\mathbb{Z}^2_{\geq 0}$ walk a “hamburger-cheeseburger” situation, where moving to the right or left (looking at blue distances) corresponds to producing or ordering a hamburger at a restaurant, and moving up or down (looking at red distances) corresponds to producing or ordering a cheeseburger.


To produce the inverse map, we just figure out how to glue the next triangle to our picture given the next step in our walk. One helpful way to visualize this process is that the dashed lines will always be on one side of our planar map walk, and the solid lines will always be on the other side, and another is the picture below:

[Figure: the four building-block triangles.]

The idea is to start with a row of blue and red dots, and then at each step, we can either delete or add a blue dot, or we can either delete or add a red dot (corresponding to getting closer to or farther from the blue root and red root). So each of these four triangles corresponds to one of the four directions of our $\mathbb{Z}^2_{\geq 0}$ random walk, and then we can basically just glue the triangles so that the edges and vertices match up.

But now we can think about the same process, but this time we will not require that our solid edges form a tree:

Remark 85. This turns out to be related to the FK (Fortuin-Kasteleyn) random cluster model, in which we construct a random graph in which each edge has some probability of appearing, where the weighting is proportional to $x^{\text{number of components}}\, y^{\text{number of edges}}$. Then every time we get an extra cycle in our planar map, we either break apart our graph or the dual graph, so the number of extra components is the same as the number of extra cycles. So we can ask for the probability of a given picture to be proportional to $z^{\text{number of planar map loops}}$ for some $z$.


Our planar map walk will now not be able to visit all triangles unless we cross some solid or dashed lines, so we will modify our procedure. Specifically, if we have a cycle of solid or dashed edges and we get to the last element in that cycle, we turn the edge the other way in its quadrilateral and flip it from solid to dashed or vice versa (coloring those edges yellow):

If we return to the hamburger-cheeseburger setting, we can do the $\mathbb{Z}^2_{\geq 0}$ random walk again. Notice that every time we pass by a solid edge for the first time, we're making a hamburger, and then when we pass by that solid edge again, we eat a hamburger. But when we modify our random walk and add yellow solid edges, the point is that every cheeseburger we make between our two visits to the yellow solid edge must get eaten within that time. Continuing the analogy, many restaurants have “burger chutes,” where new burgers are put on the top of a stack (rather than a queue) and the burgers taken by customers are also grabbed from the top. So what we're saying here is actually the following claim: a yellow edge (solid or dashed) corresponds to a hamburger or cheeseburger being taken from the top of the stack.

But looking at the picture, it's difficult to see without colors whether the yellow edge is solid because it was flipped as a result of a cycle or if it was just there on its own. So instead of labeling that step as “eat hamburger” or “eat cheeseburger,” we can label it as “eat fresh burger,” so we now have words over an alphabet of 5 symbols instead of 4 (make C, eat C, make H, eat H, or eat F), where we do end up with no burgers at the end of the word. The punchline here is that we get a bijection between the set of planar maps with any distinguished set of edges (not necessarily a spanning tree) and the set of words with conditions as above. So using the more complicated FK model we mentioned is basically assigning a different probability to the “eat fresh burger” element! And in the limit, it turns out that this $\mathbb{Z}^2$ walk becomes a two-dimensional Brownian motion with different diffusive terms along the $y = x$ and $y = -x$ axes, and the relative rates there depend on the fraction of “eat fresh burger” orders.
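The stack rule for the five-symbol words can be sketched as follows. This is a minimal illustration under assumed conventions, not code from the paper: `H` and `C` make a burger, `h` and `c` eat the most recently made burger of that type still on the stack, and `F` eats whatever is on top. The function name and the symbols are hypothetical.

```python
def run_burger_word(word):
    """Process a word over {H, C, h, c, F} with a burger stack.

    Returns the list of burgers eaten, in order, if every order can be
    filled and the stack is empty at the end. Returns None otherwise.
    """
    stack = []
    eaten = []
    for symbol in word:
        if symbol in ("H", "C"):
            stack.append(symbol)              # a new burger goes on top
        elif symbol == "F":
            if not stack:
                return None                   # nothing to eat
            eaten.append(stack.pop())         # fresh order: take the top burger
        elif symbol in ("h", "c"):
            wanted = symbol.upper()
            # Search from the top for the most recent burger of this type.
            for i in range(len(stack) - 1, -1, -1):
                if stack[i] == wanted:
                    eaten.append(stack.pop(i))
                    break
            else:
                return None                   # no burger of the requested type
        else:
            raise ValueError("unknown symbol: " + repr(symbol))
    return eaten if not stack else None
```

In the word `HCFh`, the fresh order eats the cheeseburger on top and the final order eats the hamburger, so the function returns `["C", "H"]`.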


17 November 4, 2021 


Last time, we discussed planar maps (in particular, relating them to quadrangulations, labeling them with spanning trees and getting corresponding spanning trees for the dual graphs, performing a random walk that travels between the two trees, and relating that walk to the “hamburger-cheeseburger” setting).


Fact 86
If we try this process with a planar map which is just a tree with three vertices (so that the dual graph has just one vertex), then this $\mathbb{Z}^2_{\geq 0}$ walk will return to the origin in the middle of the walk (because it will travel $(0,0) \to (0,1) \to (0,0) \to (0,1) \to (0,0)$; the idea is that multiple edges between a blue and a red vertex are possible in our framework). This can be understood best by thinking of our four “building block triangles,” which each either add or remove a blue or a red vertex. (In particular, a planar map which is a tree of two vertices with an additional self-loop will give us a random walk $(0,0) \to (0,1) \to (0,0) \to (1,0) \to (0,0)$.)

Having these degenerate cases may seem less than ideal, but they are necessary because of the different kinds of triangulations that are studied by mathematicians (depending on whether we have self-loops and whether we allow multiple edges): we need to allow all kinds of planar maps for the bijection we talked about last time to work.


Remark 87. Recall that when we allowed our solid black edges to have cycles, we needed to add “eat fresh burger” orders to our instructions, corresponding to flipping an edge to become yellow. Doing so changes the shape of the random walk based on the probability $p$ of a fresh burger order. It turns out there's a phase transition at $p = \frac{1}{2}$: in the scaling limit, for $p$ larger than $\frac{1}{2}$ we just have a Brownian excursion in one direction instead of two (because we never get significant imbalances between the number of cheeseburgers and hamburgers).

We'll now turn our attention to a more general discussion of universal random structures in two dimensions:


- We start with trees. Recall that if we start with a rooted tree, we can trace its boundary and get a sequence of “up” and “down” steps (corresponding to moving farther away from or closer to the root) and get a Dyck path. This process can be reversed: we can recover the original tree by drawing horizontal chords under the Dyck path and gluing edges together if they're connected by a chord. We can similarly construct a continuum random tree from a Brownian excursion (introduced by Aldous in 1993), which gives us a random metric space. Basically, we construct an equivalence relation on a Brownian excursion where two points connected by a chord are identified, and then the distance between points is basically the minimum vertical distance we need to travel to get from one to another.


- Next, we can think about the SLE curves we've been talking about, which have the conformal Markov property. There's also a radial version of SLE which we haven't discussed much, where we grow a path from the boundary of a disk to the origin (still using conformal maps $g_t$ similar to the ones we've already been discussing in the half-plane setting, and where 0 takes the role of $\infty$; this time $W_t$ tracks a boundary point moving on the circumference of the disk, and it moves as a Brownian motion $e^{i\sqrt{\kappa} B_t}$ on that circumference). These objects come up in settings like the maze generator and Wilson's algorithm: for example, if we take a uniformly random maze and look at the path that takes the center of the maze to the boundary, in the limit, the law of that path will be a radial SLE curve with $\kappa = 2$ (so it has fractal dimension $1 + \frac{\kappa}{8} = \frac{5}{4}$).¹ This can be proved by trying to compute a martingale similar to the conditional height of the free field with $\pm\lambda$ boundary conditions: this one will be a harmonic extension of the function which is some constant $c$ at the tip and 0 on the rest of the boundary, chosen so that the function is 1 at the center of the maze.


Note that showing that a scaling limit exists often requires some work of its own, but some limits in these kinds of settings can be found from compactness. For example, if we look at a random subset of the box of our maze (like our random curve), we have a probability measure on random subsets of the box for each mesh size, and then we can argue that there is a subsequential limit. For example, probability measures on $[0,1]$ must have subsequential weak limits (look at the cdfs, find a subsequence such that the cdf values at $\frac{1}{2}$ converge, then the ones at $\frac{1}{4}$ and $\frac{3}{4}$ converge, and then use Cantor diagonalization on the set of all dyadic rationals), so the same thing works for squares (seeing whether the set intersects small rectangles with dyadic rational endpoints). However, the limiting object doesn't need to be a curve or anything like what our original objects were, so even more work is needed to proceed from there (and then we need to show that subsequential limits all agree). That's why there's only a handful of papers that have actually been published with a scaling limit theorem.


(And quickly returning to the maze again, $\kappa = 8$ is the correct value for the curve that is traced out in the limit by the boundary of our uniform maze: it's the smallest $\kappa$ for which the curves fill space.)


¹ Oded Schramm would sometimes say that “sometimes the answer to a question which begins with ‘why’ is just to do a calculation,” so we shouldn't always expect that there's a reason for why we take a particular value of $\kappa$ in settings like this.

- We can also glue together quadrilaterals or other shapes to form a random surface (also a random metric space), as we mentioned last lecture. These converge to a random metric space called the Brownian map, which is homeomorphic to the 2-sphere but has Hausdorff dimension 4 and also has connections to Liouville quantum gravity. (This convergence was proved by looking at geodesic trees.) One interesting thing we can do with the resulting triangulations is to relate them to circle packings in the plane: this circle packing is essentially a discretization of a conformal map, because it tells us about local radii at each vertex.

We can then ask about the scaling limit of our picture formed from a random planar map, looking at what happens to the curves in the limit: we should have $\mathrm{SLE}_2$ curves for the red and blue trees and an $\mathrm{SLE}_8$ curve between them, and it turns out that we get a Liouville quantum gravity sphere with $\gamma^2 = 2$ when we glue our two trees together. So this picture really connects a lot of objects we've studied together!
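Returning to the first item in the list above, the correspondence between rooted trees and Dyck paths can be sketched in a few lines. This is a minimal illustration: representing a tree as a nested list of children and the function names below are choices made here, not notation from the lecture. The distance function uses the discrete analogue of the chord rule (heights at the two times minus twice the minimum height between them).

```python
def tree_to_dyck(tree):
    """Contour-trace a rooted plane tree (a nested list of children) into U/D steps."""
    steps = []
    for child in tree:
        steps.append("U")                    # walk away from the root
        steps.append(tree_to_dyck(child))    # trace the subtree
        steps.append("D")                    # walk back toward the root
    return "".join(steps)


def dyck_to_tree(path):
    """Rebuild the nested-list tree from a Dyck path of U/D steps."""
    root = []
    stack = [root]
    for step in path:
        if step == "U":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif step == "D":
            if len(stack) == 1:
                raise ValueError("path dips below zero")
            stack.pop()
        else:
            raise ValueError("unknown step: " + repr(step))
    if len(stack) != 1:
        raise ValueError("path does not return to zero")
    return root


def tree_distance(path, i, j):
    """Graph distance in the tree between the vertices visited at times i and j."""
    heights = [0]
    for step in path:
        heights.append(heights[-1] + (1 if step == "U" else -1))
    lo, hi = min(i, j), max(i, j)
    return heights[i] + heights[j] - 2 * min(heights[lo:hi + 1])
```

For the tree `[[], [[]]]` (a root with a leaf child and a second child that itself has one leaf), `tree_to_dyck` gives `"UDUUDD"`, and `dyck_to_tree` recovers the same nested list.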


18 November 9, 2021 


We'll continue some of the stories from last time; today, we'll talk about Julia sets. The way these work is that we consider the map $\phi(z) = z^2$ in the complex plane, which maps the unit disk $\mathbb{D}$ to itself (in a 2-to-1 way) and thus maps $\mathbb{C} \setminus \mathbb{D}$ to itself in the same manner. Applying this map repeatedly makes points in $\mathbb{C} \setminus \mathbb{D}$ drift to infinity, and we can also construct a similar 2-to-1 conformal map from $\mathbb{C} \setminus K$ to itself for any compact $K$ with connected hull by using the Riemann mapping theorem.

It turns out that there are certain sets $K$ where this conformal map is very simple, and the idea is to look at the function $\phi_c(z) = z^2 + c$ (the next simplest conformal map) and take the set of points $K$ where repeated iteration of $\phi_c$ keeps our points bounded. (So if $c = 0$, we just have the usual disk.) This set $K$ is called a Julia set (closely related to the Mandelbrot set, in which we fix $z = 0$ and look at the allowed values of $c$), and there are many good visualizations of Julia sets which we can find on Google.
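The defining condition (iterates of $z \mapsto z^2 + c$ stay bounded) can be tested numerically with a standard escape-time check. This is a rough sketch: the function name and the iteration cap are assumptions made here, and a `True` answer only means the point has not escaped within `max_iter` steps, so it is an approximation near the boundary. The escape radius $\max(2, |c|)$ is the usual one, since any iterate with larger modulus is guaranteed to run off to infinity.

```python
def stays_bounded(z, c, max_iter=200):
    """Approximate test of whether iterating z -> z*z + c keeps z bounded."""
    radius = max(2.0, abs(c))   # beyond this modulus the orbit must escape
    for _ in range(max_iter):
        if abs(z) > radius:
            return False
        z = z * z + c
    return True
```

With $c = 0$ this recovers the unit disk: `stays_bounded(0.5, 0)` is `True` and `stays_bounded(1.5, 0)` is `False`. Fixing $z = 0$ and varying $c$ instead gives the corresponding test for the Mandelbrot set.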


Fact 88
For some values of $c$, the Julia set looks tree-like (dendritic), while for other values, it has more filled-in regions (though in both cases the set will be closed). These sets also exhibit self-similarity, because a region in the Julia set has two preimage regions (and locally conformal maps look like dilations and rotations).


Random planar map theory is often motivated by concepts in complex dynamics like this one. For example, we can look up conformal mating, which involves taking two of these sets $K$ and gluing them along their boundaries (looking at the measure pulled back from the conformal map to the circle). The result turns out to be a topological sphere, and we want an embedding of it in space such that some rational function preserves the dynamics (it maps the boundary set to itself in a 2-to-1 way).


Fact 89
We can see this procedure in action at https://www.math.univ-toulouse.fr/~cheritat/MatMovies/, in which we glue one fractal set to the inner part of an annulus, the other to the outer part of the annulus, embed the annulus in the sphere, and shrink the width of the annulus.


This process looks different based on whether we have filled-in or dendritic sets. For instance, if we try to glue a filled-in set and a dendritic set together, the latter tells us how to identify points on the boundary of the former, and we get a space-filling curve on the sphere (reminiscent of the interface between trees from previous classes) if we glue two dendritic sets together.

Fact 90
Recall that continuum random trees can come from Brownian excursions, where we glue together horizontal chords. Then we can glue two random trees together by running Brownian excursions $X, Y$ on $[0, 1]$, picking some $C$ so that the graphs of $X$ and $C - Y$ don't overlap, and then identifying points joined by a vertical segment between the two graphs. The resulting shape is also a topological sphere, and the boundary is a space-filling path, but we need Moore's theorem to show this (in particular, it's important that the two trees aren't exactly identical because then we get back a tree).


Remark 91. Note that we can create a Brownian excursion from a Brownian bridge $B_t$ on $[0,1]$ by finding the point $s \in [0,1]$ where the bridge achieves its minimum, extending the Brownian bridge periodically, and then taking $B_t - B_s$ for $t \in [s, 1+s]$ to get our excursion.
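A discrete version of this bridge-to-excursion recipe is easy to write down. The sketch below works with a list of integer steps summing to zero (a discrete stand-in for the bridge, which is an assumption of this example) and rotates the step sequence to start at the first time the minimum is reached. The function name is hypothetical.

```python
def bridge_to_excursion(steps):
    """Cyclically shift a discrete bridge at its first minimum.

    steps is a list of increments with sum zero. The returned list of
    partial sums starts and ends at 0 and never goes negative.
    """
    if sum(steps) != 0:
        raise ValueError("steps must sum to zero (a bridge)")
    position, lowest, s = 0, 0, 0
    for i, step in enumerate(steps, start=1):
        position += step
        if position < lowest:          # strict: keeps the first time of the minimum
            lowest, s = position, i
    rotated = steps[s:] + steps[:s]    # periodic extension, read from time s
    path = [0]
    for step in rotated:
        path.append(path[-1] + step)
    return path
```

For the bridge with steps `[1, -1, -1, 1]`, the minimum is first reached after three steps, and the function returns the excursion `[0, 1, 2, 1, 0]`.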


If we now return to Liouville quantum gravity, which is the object $e^{\gamma h(z)}\,dz$ for a Gaussian free field $h$ (more specifically we need to take a weak limit of $\epsilon^{\gamma^2/2} e^{\gamma h_\epsilon(z)}\,dz$ as $\epsilon \to 0$), we can draw our measure by repeatedly subdividing a square into its four pieces and stopping the process for a given square when its measure (under the LQG) is smaller than some threshold $\delta$. The story we can keep in mind here is that this is similar to taking a random quadrangulation and conformally mapping it to the sphere, looking at the sizes of the resulting images of quadrilaterals (this is also additionally connected to circle packings having varying radii). Specifically, we get a random measure when we take the mesh of the quadrangulation to be small, and we also get one from the LQG, and these should be locally the same. A result like this has been proven for uniformly random triangulations (with a particular kind of embedding), and there are various other models in which we can study convergence of a pair of trees, but more generally results are not known.
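The subdivision rule itself does not depend on how the measure is produced, so it can be sketched with any function that reports the mass of a square. In the illustration below, `measure` is a caller-supplied stand-in (an assumption of this example; no LQG measure is simulated here), and the two toy measures are included only to show how uneven mass leads to uneven squares.

```python
def subdivide(measure, delta, square=(0.0, 0.0, 1.0), max_depth=12):
    """Split a square (x, y, side) into four until each piece has mass below delta.

    measure(x, y, side) must return the mass of the square with lower-left
    corner (x, y). max_depth is a safety cap for measures with atoms.
    """
    x, y, side = square
    if measure(x, y, side) < delta or max_depth == 0:
        return [square]
    half = side / 2.0
    pieces = []
    for dx in (0.0, half):
        for dy in (0.0, half):
            pieces.extend(
                subdivide(measure, delta, (x + dx, y + dy, half), max_depth - 1)
            )
    return pieces


def lebesgue(x, y, side):
    """Ordinary area: every square at a given scale has the same mass."""
    return side * side


def tilted(x, y, side):
    """Mass of the density 2x on the square: a toy non-uniform measure."""
    return side * ((x + side) ** 2 - x ** 2)
```

With `lebesgue` the result is a regular grid, while with `tilted` the squares are smaller where the density is larger, which is the qualitative picture to keep in mind for the LQG squares.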

Now we can take our set of LQG squares and consider a loop-erased random walk. Recall that when we uniformly pick $(M, T)$, where $M$ is a planar map and $T$ is a spanning tree, then the probability of $M$ is proportional to the number of spanning trees it has, which is related to the determinant of the Laplacian on $M$. More generally, we can weight by different powers of the determinant of the Laplacian, and in fact recall that this corresponds to changing the value of $\gamma$ in our LQG.

For an ordinary Euclidean grid, as the mesh size $\delta \to 0$, the path should be independent of the LQG and look like a time-changed Brownian motion, so it is an $\mathrm{SLE}_2$ curve. The fractal dimension of this curve is $\frac{5}{4}$, meaning that we expect a curve to hit $\epsilon^{-5/4}$ boxes in our domain if we have a regular grid of boxes of side length $\epsilon$. But now let's look at the $\mathrm{SLE}_2$ curve on our Liouville quantum gravity boxes (where each box has area on the order of $\delta$, so there are about $\frac{1}{\delta}$ total squares in the picture). Based on our discussion earlier about Brownian excursions, taking $\gamma = \sqrt{2}$ (corresponding to $\kappa = 2$) in our LQG, we should get about $\sqrt{N}$ of the $N$ squares (meaning we get about $\delta^{-1/2}$ boxes), because it's morally the same as a loop-erased random walk. And the idea behind the KPZ formula is then that we can relate these exponents $\frac{5}{4}$ and $\frac{1}{2}$ in the expected way, and that's what convinced Polyakov (who came up with the LQG construction) that the planar map story and the LQG story are the same. Here's what the KPZ formula says:


Theorem 92 (KPZ formula)
Let $\Delta$ be the scaling exponent (see below) in the Liouville quantum gravity grid, and let $x$ be the scaling exponent in the usual Euclidean grid. Then

$$x = \frac{\gamma^2}{4}\Delta^2 + \left(1 - \frac{\gamma^2}{4}\right)\Delta.$$

This is indeed true for $\gamma = \sqrt{2}$, $\Delta = \frac{1}{2}$, $x = \frac{3}{8}$ as in the case above. Note that we use $x = \frac{3}{8}$ because there is a probability $\epsilon^{3/4} = (\epsilon^2)^{3/8}$ that we hit a given square in the Euclidean grid, and similarly there is a probability $\delta^{1/2}$ that we hit a given square in the LQG grid, and those probabilities are how we define the scaling exponents. But what we're saying is that given any fractal curve, this KPZ formula should hold: it doesn't depend on the randomness of the fractal itself.
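As a quick arithmetic check of the loop-erased walk example, the sketch below evaluates the KPZ relation in its standard quadratic form $x = \frac{\gamma^2}{4}\Delta^2 + (1 - \frac{\gamma^2}{4})\Delta$ using exact fractions. The function name is an assumption of this example, and the formula is taken in that standard form rather than from the garbled display in these notes.

```python
from fractions import Fraction


def kpz_euclidean_exponent(gamma_squared, delta):
    """Euclidean exponent x obtained from the quantum exponent delta via KPZ."""
    g = Fraction(gamma_squared) / 4
    delta = Fraction(delta)
    return g * delta ** 2 + (1 - g) * delta


# Loop-erased walk example: gamma^2 = 2 and quantum exponent 1/2.
x = kpz_euclidean_exponent(2, Fraction(1, 2))
euclidean_dimension = 2 - 2 * x   # a set hit with probability (eps^2)^x has dimension 2 - 2x
```

Here `x` equals `Fraction(3, 8)` and `euclidean_dimension` equals `Fraction(5, 4)`, matching the exponents quoted for $\mathrm{SLE}_2$. Setting `gamma_squared` to 0 returns `delta` unchanged, as expected when there is no clustering.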


Remark 93. We can think of this with the following analogy: if we do a random road trip in the United States, we expect to see more people if they're uniformly distributed by population (“Euclidean grid”) rather than clustered in cities (“Liouville quantum gravity”). And the value $\gamma$ just tells us the degree to which clustering occurs.


19 November 16, 2021 


We'll start today's class with a puzzle involving dimer models in three dimensions (a generalization of domino tilings of a square grid). We mentioned in a previous class that there's a bijection between spanning trees and dimer models in certain corresponding grids, and then we can look at the height function for this model, where we color the vertices of the grid black and white in a checkerboard manner. We then assign values to the faces by increasing by 1 when we go counterclockwise around a black vertex without crossing an edge of the matching and decreasing by 3 when we do cross an edge of the matching:

[Figure: a perfect matching on a square grid, with the height function value written on each face.]


This means that we'll end up with a height difference of 3 across edges of the matching and a height difference of 1 otherwise. We can notice that the value mod 4 is fixed regardless of our edge configuration, but we can do a local move by taking two edges that are parallel and adjacent and rotating them, which will just change the height on a single face (for example, we can push the 6 in the middle down to a 2). We can then get a one-to-one correspondence between matchings and functions satisfying the edge-difference constraints, and one question we might be interested in is whether we can get from one matching to another using local moves. The answer is yes: we can look at the difference between the height functions of the tilings at each face, find a place where that difference is maximal, and then show that we can decrease the larger one to make the height functions closer to each other (because if we look at all of the points where the maximal difference is achieved, one of them must be a local maximum for the larger height function, so then we can do a local move to bring the height function down by 4). Specifically, among all faces $f$ where $h_1(f) - h_2(f)$ is largest, find one where $h_1(f)$ is larger than at all of its neighbors; this implies that we can do a local move and send $h_1(f) \mapsto h_1(f) - 4$. So eventually we'll reach the minimum, which is where the two tilings are the same.

Thus, if we want to choose a perfect matching at random, we can do a Markov chain Monte Carlo process: just start with some perfect matching and repeatedly perform local moves (randomly picking a face and pushing it up or down), and this will eventually mix and converge to the uniform distribution (which is stationary here).

Fact 94
Notice that if we overlay the matchings from $h_1$ and $h_2$ on top of each other, we get a collection of cycles and double edges. Crossing a cycle changes the value of $h_1 - h_2$ by 4, so this height function difference essentially tells us how many cycles a face is nested within.
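The local-move Markov chain described above can be sketched directly on matchings, without computing the height function. In this minimal illustration the grid vertices are stored as `(row, col)` pairs and a matching is a dictionary sending each vertex to its partner. That representation and all function names are assumptions of this example. A move picks a unit face uniformly at random and rotates the two parallel matched edges around it if there are any.

```python
import random


def horizontal_matching(rows, cols):
    """All-horizontal perfect matching on a rows x cols grid (cols must be even)."""
    matching = {}
    for r in range(rows):
        for c in range(0, cols, 2):
            matching[(r, c)] = (r, c + 1)
            matching[(r, c + 1)] = (r, c)
    return matching


def flip(matching, r, c):
    """Try the local move on the face whose top-left vertex is (r, c).

    Returns True if the face was surrounded by two parallel matched edges
    (which are then rotated), and False if no move is possible there.
    """
    a, b, d, e = (r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)
    if matching.get(a) == b and matching.get(d) == e:      # two horizontal edges
        pairs = [(a, d), (b, e)]
    elif matching.get(a) == d and matching.get(b) == e:    # two vertical edges
        pairs = [(a, b), (d, e)]
    else:
        return False
    for u, v in pairs:
        matching[u] = v
        matching[v] = u
    return True


def run_chain(matching, rows, cols, steps, seed=0):
    """Repeatedly pick a uniformly random face and flip it when possible."""
    rng = random.Random(seed)
    for _ in range(steps):
        flip(matching, rng.randrange(rows - 1), rng.randrange(cols - 1))
    return matching


def is_perfect_matching(matching, rows, cols):
    """Check that every vertex is paired with exactly one grid neighbor."""
    for r in range(rows):
        for c in range(cols):
            p = matching.get((r, c))
            if p is None or matching.get(p) != (r, c):
                return False
            if abs(p[0] - r) + abs(p[1] - c) != 1:
                return False
    return True
```

Because each proposed move is its own inverse and faces are chosen uniformly, the transition probabilities are symmetric, which is why the uniform distribution is stationary for this chain. How many steps are needed for mixing is a separate question that this sketch does not address.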


Problem 95
Can we do this process in three dimensions as well? In other words, given two matchings on an $N \times N \times N$ grid, can we get from one to the other using a sequence of local moves?


Here, the answer depends on which moves are considered local. If we only allow for the two-dimensional local moves above with two parallel edges, the answer is no. Here's a counterexample: consider a $3 \times 3 \times 2$ grid where the bottom and top layers look like the $3 \times 3$ grids shown below (and the two center dots are also connected):

[Figure: the matchings on the bottom and top $3 \times 3$ layers.]

Then there are no local moves that can be made at all, so we can't get from this configuration to any other one.


Remark 96. Domino tilings in dimension 3 have been studied (for example by Nicolau Saldanha), and we know for example that there are 5051532105 tilings for a $4 \times 4 \times 4$ box (but there aren't nice formulas at all). Essentially, higher-dimensional dimer tilings lose a lot of nice structure that exists in two dimensions.


Instead, we can consider a larger collection of local moves: the two-dimensional local moves can be thought of as taking a 4-cycle in our grid where every other edge is included and swapping which edges in the cycle appear in our matching (also known as “flips”). We can do something similar in three dimensions by allowing for a cycle of length 6 (also known as “trits”), which is what Saldanha does in his work. So the question is whether we can get from any matching to any other matching using flips and trits, and that's something we can think about!


Fact 97
Dimer models are closely connected to determinants and non-intersecting paths: if we overlay our two-dimensional matching (marked in black below) with the matching which has all horizontal edges but no local moves (marked in blue below, so that the height function maximally increases), we end up with a sequence of non-intersecting paths with some nice properties.


The idea is that after each blue line in a path, the matching dictates whether the path moves upward, downward, or to the right, and local moves can be thought of as pushing our paths upward or downward. This same process can be done in three dimensions as well, and we get non-intersecting paths where there are five possible moves at each step. But thinking of untangling these non-intersecting paths in higher dimensions is more difficult!

We'll now turn back to the topics in our class, looking at the Eden random growth model (introduced by Eden in 1961 as a way to think about cancer growth).


1. One point of view is where we have a graph $(V, E)$, such as $\mathbb{Z}^2$, and assign an exponential random variable edge weight to each edge (which we can think of as the time needed to cross a given edge). Then we might be interested in the set of vertices which can be reached after some total time $N$. This gives us a shape which is similar to a ball but with additional roughness, and it turns out that the limit shape is convex but not actually a Euclidean disk (because $\mathbb{Z}^2$ isn't isotropic enough).

On the other hand, it was shown that we do get a limiting Euclidean disk if we replace $\mathbb{Z}^2$ by a lattice which is rotationally invariant (a property that we often want in this subject), but grids tend to not be rotationally invariant because they're only symmetric under very specific rotations. We can fix this by constructing a random lattice: rain down a Poisson point process on $\mathbb{R}^2$ (which will be rotationally invariant) and construct the Voronoi tessellation (which makes polygonal faces out of sets of points in the plane closest to a given point in our Poisson point process). We can then do our Eden model growth on this graph of polygonal sets, and then we end up with a Euclidean disk in the limit because we have rotational invariance in law.


2. Another point of view for this growth model comes from the fact that the exponential random variable is 
memoryless, and this perspective is more similar to the cancer growth motivation: at step n, we can think of 
having a cluster C,, of visited points, and then we grow to a cluster C,+41 by selecting a uniform vertex from the 
boundary OC, and adding it to our cluster (and the memoryless property tells us that we can do the next step 
independently of the current one). Basically, choosing this vertex is like waiting for the first exponential clock 


adjacent to our cluster ringing and adding it. 


Variants on this model are also possible: we can adjust the weighting of our boundary points by running a random walk
that starts far away from the cluster and adding the first point of $\partial C_n$ that the random walk hits. The resulting
law on $\partial C_n$ is known as the harmonic measure, and in this model growth leads to more growth in the same
directions (this is diffusion limited aggregation, or DLA). We can then generalize this further by choosing growth
according to a power $\eta$ of the harmonic measure, and this gives us $\eta$-DBM, the dielectric breakdown model from physics.
The picture under DLA has much more dendritic growth than under the uniform measure, because once we
start forming a dendrite it becomes difficult for the random walk to reach inside it.


These DLA clusters do indeed appear in nature (searching “DLA cluster” on Google gives us many images).
Simulations are easy to do for these types of models (and there are many papers in physics journals about
DLA), but it is hard to prove mathematical statements about them. We don't know much about large-scale behavior,
like whether the limit shape is random, or whether it has a scaling limit, or what the asymptotic dimension is (the
simulation prediction is approximately 1.71). Professor Sheffield has done some analysis of DLA on random
planar maps and LQG surfaces, where some additional progress can be made.
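The second description of the Eden model above is easy to simulate. The following is a minimal sketch, assuming the graph is $\mathbb{Z}^2$ and following the vertex-uniform rule exactly as stated in the text; the function name `eden_cluster` and its parameters are illustrative choices, not notation from the lecture. (Picking a uniform boundary edge instead of a uniform boundary vertex is a closely related variant.)

```python
import random

def eden_cluster(steps, seed=0):
    """Grow an Eden cluster on Z^2 by adding one uniform boundary vertex per step."""
    rng = random.Random(seed)
    cluster = {(0, 0)}
    # Boundary = vertices outside the cluster with at least one neighbor inside.
    boundary = {(1, 0), (-1, 0), (0, 1), (0, -1)}
    for _ in range(steps):
        site = rng.choice(sorted(boundary))  # sorted() makes the run reproducible
        boundary.remove(site)
        cluster.add(site)
        x, y = site
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nb not in cluster:
                boundary.add(nb)
    return cluster

cluster = eden_cluster(500, seed=1)
radius = max(abs(x) + abs(y) for x, y in cluster)
print(len(cluster), radius)
```

Replacing the uniform choice by the hitting point of a random walk started far away would turn this sketch into a DLA simulation, at the cost of a much slower inner loop.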


As promised, we'll now turn to some references (mating of trees: https://arxiv.org/pdf/1409.7055.pdf,
quantum Loewner evolution: https://arxiv.org/pdf/1312.5745.pdf, LQG metrics: https://arxiv.org/pdf/2109.01252.pdf)
which go into more detail about some topics we've discussed during class. Concepts that are covered include welding
quantum wedges together and creating an SLE curve along the resulting boundary, embedding our gluing of Brownian
excursions in a harmonic way to get a Liouville quantum gravity surface, using these concepts to think of SLE$_\kappa$ curves
for $\kappa > 4$ as coming from gluing of certain trees, and performing an Eden model growth on a planar map (adding
triangles one at a time, where it turns out we just need to keep track of the boundary length for conditioning on future
growth). This idea of “growing for a while, then resampling the tip, then restarting” gives us the quantum Loewner
evolution QLE$(\gamma^2, \eta)$ for certain values of $\gamma$ and $\eta$.


20 November 18, 2021 


Example 98

We'll start by asking a quick question about how Liouville quantum gravity is connected to the words “quantum”
or “gravity” at all.


In quantum physics, there's a notion of the Feynman path integral, where (loosely) we treat Brownian motion as
a measure on paths, and then we have a quantum system where a particle can take all such paths so we must integrate
over all of them. Then the Schrödinger equation is a variant of the heat equation (just with an extra factor of $i$, so
instead of smearing out a wavefunction it evolves it in a unitary manner), and it helps us understand the behavior of a
single particle.

But if we have larger systems, we may have to integrate over spacetimes rather than over one-dimensional
trajectories, and depending on the masses and their positions, Einstein's equations basically give us different curvatures
for these spacetimes. It turns out that writing Einstein's equations on $(1+1)$-dimensional spacetime gives us an action, and
then under a certain formalism where we quantize to a lattice approximation and weight by $e^{-(\text{energy})}$, we essentially
end up with Liouville quantum gravity. So the original name of LQG came from physics, even before all of these objects
were well-defined, and there's also a secondary explanation from string theory.


Example 99

Before we begin with the main part of the lecture, we'll return to the problem about local moves on
three-dimensional lattices that we introduced last time. One interesting story is that we can consider local moves on
a two-dimensional torus: if we add all edges that wrap around our rectangular grid, we can ask the question of
whether we can connect any two perfect matchings on a torus by local moves.


It turns out the answer is not quite. Remember that when we define the height function $h$ on the faces between
our vertices, we are essentially keeping track of the “amount of flow” crossed between black and white vertices (colored
in a checkerboard manner). Specifically, we can think of a filled vertex as producing 3 units of flow on the edge towards
its matched empty vertex and $-1$ units towards each of its other neighbors, and $h$ sums up those flows as we travel around
the faces. Stokes' theorem tells us $h$ is well-defined in the rectangular grid case because the sum of flows around each
loop must be 0 if the divergence is 0 at each vertex, but this no longer holds true in the torus case (consider the
all-horizontal perfect matching where there are no sets of parallel edges, like in last lecture). So $h$ now needs to be
multivalued for anything to make sense, but local moves still do not change the value of $h$.

In other words, this ends up being a question about homology: if two matchings have different flows in the
horizontal or vertical direction (where we calculate flow by adding up the flow across a loop that goes
around the torus once horizontally or vertically), then they cannot be connected. But within each homology class, the same
argument with the difference of height functions $h_1 - h_2$ that we did in the rectangular grid case still works, because
the “wrap-around gain” cancels out.
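To make the flow invariant concrete, here is a small sketch that computes the net horizontal flow across a vertical cut of an $n \times n$ torus for two all-horizontal matchings. The conventions are assumptions made for illustration: vertex $(i, j)$ is black when $i + j$ is even, a matched edge carries 3 units from black to white, and an unmatched edge carries $-1$ units from black to white. The helper names `brick_matching` and `x_flow` are hypothetical.

```python
def brick_matching(n, offset=True):
    """All-horizontal perfect matching of the n x n torus (n even).

    With offset=True, odd rows are shifted by one column, so no two matched
    edges are parallel neighbors; with offset=False, all rows are aligned.
    """
    return {frozenset(((i, j), ((i + 1) % n, j)))
            for j in range(n)
            for i in range((j % 2) if offset else 0, n, 2)}

def x_flow(matching, n, cut=0):
    """Net flow in the +x direction across the edges from column cut to cut+1."""
    total = 0
    for j in range(n):
        a, b = (cut % n, j), ((cut + 1) % n, j)
        weight = 3 if frozenset((a, b)) in matching else -1
        # The flow `weight` runs from the black endpoint to the white endpoint.
        black_is_left = (a[0] + a[1]) % 2 == 0
        total += weight if black_is_left else -weight
    return total

n = 4
brick = brick_matching(n)
aligned = brick_matching(n, offset=False)
print([x_flow(brick, n, c) for c in range(n)])
print([x_flow(aligned, n, c) for c in range(n)])
```

Because the flow is divergence-free, the answer does not depend on which column is cut, and the two matchings land in different classes, so under these conventions no sequence of local moves can connect them.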

So now we can think about the same question in an $N \times N \times N$ torus, asking whether we can connect two matchings
in the three-dimensional case using local moves if they are in the same homology class. One interesting case we can
consider is to have all edges only along the $x$- or $y$-directions, where successive $z$-layers have edges that all point in
the $-x$, then $-y$, then $+x$, then $+y$ directions from black to white in a “bricklaying” pattern for each layer. Then
local moves are not possible, and there are also no twists (because there are no edges pointing up or down), but the
net amount of flow in each direction is zero! So the homology class of this matching is zero, but it's not connected
to other zero-flow matchings.


Fact 100

We can also define a flow in three dimensions (thinking about filled edges as having 5 units of flow and
empty edges as having $-1$ units of flow), but we cannot define a height function in the same way.


One interesting property of this bricklaying matching on the torus is that even if connecting different matchings
on the torus is possible, we may need very large local moves to do so (for example, if the bottom $n/4$ layers are all
pointing in the $-x$ direction, then the next $n/4$ in the $-y$ direction, and so on, then there's no way to form a small
cycle where every other edge comes from our matching edges because they're all moving in the same direction). So
it makes sense to try to use this kind of argument of “creating large flows” and mimic the construction on our
three-dimensional rectangular grid, but it's not clear whether this will work or not.

We're now going to turn back to the models of random growth and random surfaces we've been discussing, and
we want to connect everything back to scaling limits of discrete systems.


Fact 101

If we look at metric balls on LQG surfaces, those balls have rougher and rougher boundaries as $\gamma$ increases over
the range $(0,2)$, because large values of $h$ make it harder to traverse certain regions of the plane. So conformal
embeddings for large $\gamma$ will give us surfaces that look a lot like trees (because most of the plane is reached in a
very short amount of time, and then we have small pockets that take a long time to be reached).


One thing we were trying to make sense of last time was the QLE$(\gamma^2, \eta)$ process. Remember that we can
construct certain such processes by running an SLE curve while repeatedly re-randomizing the tip at points in time, and
the case $(\gamma^2, \eta) = (8/3, 0)$ (which can be constructed this way) is also related to the Brownian map. Recall that we
have the relation $\gamma^2 = \kappa$, and $\kappa = 8/3$ gives the SLE curve which is the scaling limit of a self-avoiding random walk or
the boundary of a percolation cluster. So to understand the Brownian map, we'll start with something discrete and
concrete:


Definition 102
A dancing snake is a random walk starting at the origin $(0,0)$ and making steps of either $(1, 1)$ or $(-1, 1)$. Snakes
can evolve in a few ways: they can grow by adding another step (to the left or the right), or they can shrink by
removing the topmost edge.
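A minimal simulation of the discrete dancing snake might look as follows. The lecture does not fix the dynamics precisely, so this sketch assumes that growing and shrinking are equally likely and that an empty snake is forced to grow; the function name `dance` is illustrative.

```python
import random

def dance(moves, seed=0):
    """Evolve a dancing snake and record the head position after each move."""
    rng = random.Random(seed)
    snake = []          # horizontal increments of the current snake, each +1 or -1
    heads = [(0, 0)]    # (x, y) position of the head; y is the snake's length
    for _ in range(moves):
        if snake and rng.random() < 0.5:
            snake.pop()                        # shrink: erase the topmost edge
        else:
            snake.append(rng.choice((-1, 1)))  # grow: add a step left or right
        heads.append((sum(snake), len(snake)))
    return snake, heads

snake, heads = dance(1000, seed=3)
print(heads[-1], max(y for _, y in heads))
```

The head height performs a reflected simple random walk, while the head's horizontal position is the endpoint of the walk currently stored in the snake, which is the discrete analogue of the Brownian picture described next.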


We may also do a Brownian motion version of this dancing snake, where the Brownian motion grows upward instead
of to the right, up to some coordinate $T$. (This is essentially the scaling limit of the simple random walk, where we
have many edges in our dancing snake and then renormalize so that we have the correct variance properties.) Then
the head of the snake will move up and down according to a Brownian motion, where moving up means we add more
Brownian motion and moving down means we erase part of the Brownian motion that we already have. To be more
rigorous, we can construct a regular Brownian motion for the head height, and then given that height we can glue
together chords to make a continuum random tree (like we did with the Brownian excursion from a few lectures ago).
Then for the horizontal position, we just need to define a Brownian motion on the continuum random tree, which is
not too difficult: we just do it branch by branch independently, and because this is a Gaussian process, we can specify
the covariance between the values at two points to be the length of the common segment of their paths to the root of the
tree. The Brownian snake's $x$-coordinate is then obtained by tracing out the boundary of our continuum random tree and
looking at the values of the Brownian motion on the tree.


Remark 103. This construction is that of a super-process (introduced by Le Gall), because we're indexing the position
of our Brownian motion using a random tree. Also, as a bonus, this model of the Brownian snake is also the scaling
limit of our hamburger-cheeseburger model when we always order a fresh burger.


So we have the x- and y-coordinates of our Brownian snake's head, and these give us two functions of time — let’s 
now assume that the y-coordinate is specified by a Brownian excursion (so that the height always stays positive, and 
we return to our starting snake location at the end time). We then get two random trees by joining chords across time, 
and then these two trees can be glued together — the resulting object is the Brownian map, and Moore's theorem 
shows us that this gives us a topological sphere. 

This Brownian map actually comes with a metric: we'll say that the $x$-coordinate tree is a geodesic tree, and
then we'll look at distances along the $y$-coordinate tree by looking at the effects of the gluing (which will create some
identifications on the geodesic tree). There are still some aspects of this definition which may seem shaky, though,
since we haven't shown that this map does not degenerate to a single point. But to show that this doesn't happen, we can
just prove that when two points agree under gluing, they have the same distance to the geodesic tree's root. (This is true
because of the way our Brownian snake is defined.) So there's a lot of subtlety in this construction, and we'll see more
about it later on!


21 November 23, 2021 


We'll discuss the Brownian map more today. We've been viewing Liouville quantum gravity as a family of surfaces
parameterized by $\gamma$, specifically via the weak convergence of measures

$$\lim_{\varepsilon \to 0} \varepsilon^{\gamma^2/2} e^{\gamma h_\varepsilon(z)} \, dz.$$


We can think of this as conformally mapping a random surface to a sphere and looking at the induced area measure 
(given some region in the plane, we look at the measure in its preimage), so we have a surface with a measure and some 
conformal structure. It turns out this procedure works if we're trying to compute lengths along the surface (just with 
a different normalization), and it also works if we want to look at the length along a Brownian motion (independent 
of the free field) run on the surface, so we can get the right time-parameterization for it. But to describe the surface 
completely, we need a distance function — the idea is that the Brownian motion will get us that metric directly (along 
with the surface and its measure), just without the conformal structure. 

Recall from our “dancing snakes” story that identifying chords of a Brownian excursion gives us a random tree
where we get a Brownian motion along each segment of the tree. We'll look at something related here: our goal is
to show that if we pick a large planar map uniformly at random from the class of $p$-angulations (all faces having $p$
adjacent edges) and use the graph distance as our metric, we get the Brownian map in the limit. If we let $M_n$ be
such a planar map (with $n$ faces) rooted at some vertex and edge, letting $V(M_n)$ be the set of vertices and $d_{gr}$ be
the graph distance, we can write down the distances from each vertex in $M_n$ to the root, and then $(V(M_n), d_{gr})$ is
a (random) metric space. This metric space lives in the set of compact metric spaces, modulo isometries, and that
set can be equipped with the Gromov-Hausdorff distance, which is a way of determining distances between two
metric spaces:


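Definition 104

Let $K_1, K_2$ be compact subsets of a given metric space. The Hausdorff distance between $K_1$ and $K_2$ is

$$d_{Haus}(K_1, K_2) = \inf\{\varepsilon > 0 : K_1 \subset U_\varepsilon(K_2) \text{ and } K_2 \subset U_\varepsilon(K_1)\},$$

where $U_\varepsilon(K)$ denotes the $\varepsilon$-neighborhood of $K$.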


In other words, the Hausdorff distance is the smallest $\varepsilon$ so that all points in $K_1$ are at most $\varepsilon$ away from a point in
$K_2$, and vice versa (we can think of $K_1$ as the set of people living in a town and $K_2$ as the set of restaurants). Note
that this definition only makes sense if we have a common metric space $K$ that both $K_1$ and $K_2$ sit inside, so that
motivates the next definition:


Definition 105

Let $(E_1, d_1)$ and $(E_2, d_2)$ be two compact metric spaces. Then the Gromov-Hausdorff distance is

$$d_{GH}(E_1, E_2) = \inf\{d_{Haus}(\psi_1(E_1), \psi_2(E_2))\},$$

where the infimum is taken over isometric maps $\psi_1 : E_1 \to E$ and $\psi_2 : E_2 \to E$ into a common metric space $E$.


Such maps $\psi_1$ and $\psi_2$ can always be found: we can just map $E_1$ and $E_2$ into the Cartesian product $E_1 \times E_2$,
or we can take $E$ to be the disjoint union of $E_1$ and $E_2$ where the distance between points of $E_1$ and points of $E_2$ is
large (to ensure the triangle inequality still holds). The Gromov-Hausdorff distance wants us to find the best way of making
$E_1$ and $E_2$ look close to each other while still preserving all of the important distance properties. Notice that the
distance between two such compact metric spaces is 0 if and only if $(E_1, d_1)$ and $(E_2, d_2)$ are isometric. To show that,
we do need to check some conditions using compactness; one thing we can note is that forming a countable dense subset of $E_1$
by adding one point at a time will give us a sequence that converges to $E_1$ in the Gromov-Hausdorff distance.
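For finite point sets inside a common Euclidean space, the Hausdorff distance of Definition 104 reduces to a max-min computation. The sketch below assumes points are given as coordinate tuples; the function name `hausdorff` is illustrative. (The Gromov-Hausdorff distance additionally requires optimizing over embeddings, which this sketch does not attempt.)

```python
import math

def hausdorff(A, B):
    """Hausdorff distance between two finite, nonempty sets of points in R^d."""
    def directed(P, Q):
        # Largest distance from a point of P to its nearest point of Q.
        return max(min(math.dist(p, q) for q in Q) for p in P)
    return max(directed(A, B), directed(B, A))

town = [(0, 0), (1, 0), (4, 0)]
restaurants = [(0, 0), (1, 1)]
print(hausdorff(town, restaurants))
```

In the town-and-restaurants picture, the first directed term is the worst walk any resident has to the nearest restaurant, and the second is the worst walk from any restaurant to its nearest resident.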

So if we now take the set of compact metric spaces that are distance at most 1 from a fixed metric space, we
have a measure space with a Borel sigma-algebra (generated by the open sets), so it indeed makes sense to talk about
a “random metric space” in the separable complete metric space $(\mathbb{K}, d_{GH})$, where $\mathbb{K}$ is the set of isometry classes
of compact metric spaces. It also makes sense to study convergence in distribution of the object $(V(M_n), n^{-a} d_{gr})$
for the graph distance $d_{gr}$ and some power $a$, chosen so that the diameter of $V(M_n)$ scales as $n^a$. We pick $a = \frac{1}{4}$ for
quadrangulations (this makes sense if we think about the fluctuations in the dancing snake for a Brownian excursion
of length $n$), and more specifically we have the following result:


Theorem 106

Let $p = 3$ or $p \ge 4$ even. Then for some constants $c_p$ (specifically $c_3 = 6^{1/4}$ and $c_p = \left(\frac{9}{p(p-2)}\right)^{1/4}$ otherwise),
$(V(M_n), c_p n^{-1/4} d_{gr})$ converges in distribution to a limiting compact metric space $(m_\infty, D^*)$, in the Gromov-Hausdorff
sense, called the Brownian map, not dependent on $p$.


There are some interesting notes about this result: first of all, the Brownian map was first shown to exist as a
subsequential limit but wasn't proven to be unique for a few years, so the language “the Brownian map” meant “one of
the possible subsequential limits.” Also, for a while, the dancing snakes story (constructing the continuum random
tree and identifying pairs of points via gluing) and the $p$-angulations definition couldn't be connected either!
But eventually these things were resolved.
We'll talk a bit more about how we work with the continuum random tree here. We define a real tree to be a
compact metric space where we can join any two points with a unique continuous and injective path, and this path is
isometric to a line segment (so this object can have infinitely many branching points and leaves). For example, if we look
at the tree formed by our Brownian excursion, any local minimum will give a branch point, and there can only be countably many
of these within a finite-length excursion, but there will be uncountably many leaves (because the leaves are the places
in the Brownian excursion where we cannot draw a nonzero-length chord containing that point). Any real tree can be
encoded by a function $g : [0,1] \to [0,\infty)$ such that $g(0) = g(1) = 0$ in a similar way as we do with the Brownian
excursion, and we define the CRT (continuum random tree) to be the one coming from the Brownian excursion.


We can now assign Brownian labels to our real tree by constructing a Gaussian process $\{Z_a\}$ indexed by
our real tree $(\mathcal{T}_e, d)$: the root vertex $\rho$ satisfies $Z_\rho = 0$, and for any two points $a, b$ on the tree we have
$E[(Z_a - Z_b)^2] = d(a, b)$ (so by the polarization identity we can deduce the covariance between any two points). We can
notice a few properties: two vertices $a, b$ of our CRT are identified if $Z_a = Z_b$ and we can go from $a$ to $b$ around the
tree by only visiting vertices with label at least $Z_a$, and almost surely each equivalence class has at most 3 points
because local minima do not coincide exactly. (In other words, we only have simple branching points.)

If we now take two points $a, b$ in the real tree, we can define

$$D^\circ(a,b) = Z_a + Z_b - 2\max\left(\min_{c \in [a,b]} Z_c, \min_{c \in [b,a]} Z_c\right),$$


where $[a, b]$ denotes the set of vertices we visit when we travel clockwise around the tree from $a$ to $b$ (we can think of
this, on the original Brownian excursion, as going down from $a$ and $b$ until both points are connected via a common
chord). We can then use this to define a distance on the tree via

$$D^*(a, b) = \inf_{a_0 = a, a_1, \ldots, a_k = b} \sum_{i=1}^{k} D^\circ(a_{i-1}, a_i).$$


If we define the equivalence relation $\sim$ where $a \sim b$ if and only if $D^*(a, b) = 0$, then we can also define the Brownian
map in this way:


Definition 107

The Brownian map $m_\infty$ is the quotient space $\mathcal{T}_e / \sim$, where $\mathcal{T}_e$ is the continuum random tree.
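The quantity $D^\circ$ can be illustrated on a discretized contour. In the sketch below, `Z` is a list of labels read off at successive positions around the tree (treated cyclically), and the two arcs $[a,b]$ and $[b,a]$ are the two ways around that cycle. This is only a finite stand-in for the continuum formula, and the names `arc` and `d_circ` are illustrative.

```python
def arc(i, j, m):
    """Indices visited going forward cyclically from i to j, inclusive."""
    k, out = i, [i]
    while k != j:
        k = (k + 1) % m
        out.append(k)
    return out

def d_circ(Z, i, j):
    """Discrete analogue of D°(a, b) for contour positions i and j."""
    m = len(Z)
    low_ij = min(Z[k] for k in arc(i, j, m))
    low_ji = min(Z[k] for k in arc(j, i, m))
    return Z[i] + Z[j] - 2 * max(low_ij, low_ji)

Z = [1, 2, 2, 2, 1, 1]
print(d_circ(Z, 1, 3), d_circ(Z, 1, 5))
```

Here `d_circ(Z, 1, 3)` vanishes because the labels never drop below 2 on the arc from position 1 to position 3, which is exactly the situation in which two points get identified in the quotient.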


We'll end with a few properties of this Brownian map: it has Hausdorff dimension 4 almost surely (we can think
of covering the Brownian excursion / dancing snake with $\varepsilon$-balls), and it is homeomorphic to $S^2$ almost surely.
One implication of this is that if we have a planar map $M_n$ with $n$ vertices, then (with probability tending to 1) there's
no separating cycle in $M_n$ of size $o(n^{1/4})$ such that both separated components have diameter at least $\varepsilon n^{1/4}$.
But we'll think more about the bijection between a discrete quadrangulation and a discrete set of trees (bringing
together our two definitions of the Brownian map) next time!


22 November 30, 2021 


Fact 108

We started the class by looking at some past final projects, giving examples of topics covered and showing that
this project has been a springboard for past students' future probability work.
We'll continue to work through Le Gall's discussion on the Brownian map today, still following the slides at
http://dept.ku.edu/~math/conferences/2012/ssp/slides/LeGall.pdf. Recall that the Brownian map is a universal
limiting object (in the space of metric spaces) which comes from the scaling limit of $p$-angulations, where proofs usually
come from finding a bijection to a geodesic tree and a dual tree which are then glued together. Specifically, we look at
the Gromov-Hausdorff topology on the set of compact metric spaces, giving us generalized notions of sigma-algebras, measures,
and (weak) convergence of measures (so that we have a space where a finite set of points (a $p$-angulation) and
the continuous Brownian map both exist and can be compared). Recall that the Gromov-Hausdorff distance between
two compact metric spaces $E_1, E_2$ is the infimum of the Hausdorff distance over isometric embeddings of $E_1, E_2$ into
a common space $E$; this satisfies the triangle inequality because we can glue the embedding of $E_1$ and $E_2$ with the
embedding of $E_2$ and $E_3$ together. From this, we mentioned last time that we can study the convergence of
$(V(M_n), n^{-1/4} d_{gr})$ for the graph distance $d_{gr}$, where $n$ is the number of faces of the map.

The main tool for studying the Brownian map, which we'll discuss today, involves finding a bijection between
maps and trees.


Definition 109

A planar tree $\tau$ is a rooted ordered (labeled) tree where children of a vertex are given different suffixes for their
labels. A well-labeled tree is a planar tree with an additional labeling $\ell$ of the vertices by positive integers so that
$\ell_\emptyset = 1$ (the root is labeled with 1) and $|\ell_v - \ell_{v'}| \le 1$ for any adjacent $v, v'$.


Below are examples of a planar and a well-labeled tree, respectively: 


[Figure: a planar tree and a well-labeled tree.]


The idea (by Cori-Vauquelin and Schaeffer) is that we have a bijection between $\mathbb{T}_n$, the set of well-labeled trees with
$n$ edges, and $\mathbb{M}_n^4$, the set of rooted quadrangulations with $n$ faces. Specifically, given a well-labeled tree $\tau$ with labels
$\ell_v$ on vertices $v$, the corresponding quadrangulation has $\tau$'s vertices plus a root vertex $\partial$, and each vertex $v$ is at
distance $\ell_v$ from that new root vertex. We basically add this vertex $\partial$ with label 0, and then we follow the (depth-first)
contour traversal of the rooted tree: at each step, we connect our current vertex to the most
recently visited vertex which had a smaller label than the current one (and we connect to $\partial$ at the beginning).
Our resulting quadrangulation's vertex labels then represent the distance to our root $\partial$.
We can notice that edges are only drawn between two vertices if their labels differ by exactly 1, so we must
have a bipartite graph, and furthermore we should be a bit more careful to check that once we have a sequence of
three vertices with increasing labels, we must close off the loop to form a cycle of length 4. The inverse of this bijection
then involves constructing the geodesic tree that our tree is intertwined with, where we should note that the geodesic tree
has the same vertices as the original tree. (Specifically, we trace the paths in our quadrangulation back to the
root, which are what we call “geodesics.” If there are multiple paths from a given vertex back to the root vertex, we break
up our vertices into separate vertices, one for each edge. Then our initial well-labeled tree and the geodesic tree will
be dual.)
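The forward direction of the construction described above (tree to quadrangulation) can be sketched in a few lines. This is an illustration under stated assumptions, not a full implementation of the bijection: the tree is given as a dictionary `children` of ordered child lists, the extra vertex is called `"d"`, and the code only produces the edge list, without the planar embedding or the rooting. The function names are hypothetical.

```python
from collections import deque

def contour(children, root):
    """Vertices met along the depth-first contour of a plane tree, one per corner."""
    seq = [root]
    def walk(v):
        for c in children.get(v, []):
            seq.append(c)
            walk(c)
            seq.append(v)
    walk(root)
    return seq[:-1] if len(seq) > 1 else seq

def cvs_edges(children, labels, root, extra="d"):
    """One edge per corner: to `extra` if the label is 1, else to the most
    recently visited corner with a smaller label (which has label exactly one less)."""
    corners = contour(children, root)
    edges = []
    for i, v in enumerate(corners):
        if labels[v] == 1:
            edges.append((v, extra))
            continue
        j = i - 1
        while labels[corners[j]] != labels[v] - 1:
            j -= 1
        edges.append((v, corners[j]))
    return edges

def distances_from(edges, source):
    """Graph distances from `source` by breadth-first search."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

children = {0: [1, 2], 1: [3]}          # a small plane tree with 3 edges
labels = {0: 1, 1: 2, 2: 1, 3: 2}       # a well-labeling: root label 1, steps of at most 1
edges = cvs_edges(children, labels, root=0)
dist = distances_from(edges, "d")
print(len(edges), dist)
```

For a tree with $n$ edges there are $2n$ corners and hence $2n$ edges in the output, and the breadth-first distances from the extra vertex reproduce the tree labels, as the text claims.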


Remark 110. When we think about the Brownian snake associated with these well-labeled trees, note that we now
have a snake which is required to stay within the first quadrant (since all labels are positive). Alternatively, we can
describe our tree by looking separately at the y-coordinate (height) of the snake, which is just dictated by the labels
on our vertices, and the x-coordinate of the snake, which comes from the geodesic tree.


We can then find a geodesic in a quadrangulation by constructing its corresponding well-labeled tree and looking
for the leftmost (first-visited) vertex whose value of $\ell$ is smaller by 1, repeatedly doing this until we get to the root. So
our geodesic tree really gives us the leftmost geodesics, and in the continuum version of this (in the Brownian map),
geodesics become paths in the metric space. It then turns out that simple geodesics only visit leaves of $\mathcal{T}_e$, since
(as mentioned last lecture) we'll have two distinct simple geodesics at simple points and three at branching points. So
in summary, this universal limiting metric space has a lot of non-Euclidean properties!


23 December 2, 2021 


Fact 111
In these last few lectures, we'll move to discussing the Yang-Mills problem in gauge theory. There are a lot of
fundamental mathematical points that mathematicians don't understand yet, but it's an ongoing field of research,
and this sometimes involves looking at physicists' perspectives and rigorizing certain computations (or even going
beyond what physicists “know”). It's interesting to look at the quote at the beginning of
https://arxiv.org/pdf/0808.1560.pdf and also the discussion in Chapter 2 of https://arxiv.org/pdf/hep-th/9411210.pdf.


To start, we'll discuss the concept of a connection in differential geometry. Recall that curvature has an effect 
on the tangent spaces at different points on a surface — for example, if we perform parallel transport on a sphere, 
going from the North Pole to a point on the equator, then moving a quarter of the way around the equator, and 
then returning to the North Pole, we'll have rotated by 90 degrees. (This is importantly related to the concept of a 
holonomy group.) Basically, there's a gauge equivalence in that we can define our tangent space to be oriented in 
some particular way at every point, but the amount of rotation when we go around a loop is still well-defined. And 
more generally, we can think about having a group of matrices, where along each direction of potential movement we 
apply some element of that group (in the sphere case above, these matrices would be rotation matrices). Specifically, 
we assign an element of a Lie algebra to each potential direction, and then we multiply those contributions around a 
loop to get the total contribution. The Lie bracket then becomes related to the curvature on the underlying surface. 

This concept of connections is what physicists study in gauge theory and use in the Standard Model, and a big question
to ask is how we define a measure on the set of random connections. To start, we do the discrete version of all of this
discussion above: consider a $d$-dimensional lattice, and for each edge on that lattice we assign a random $N \times N$ matrix
$g$ from a compact group like $SO(N)$ or $SU(N)$ (sampled from the Haar measure, which is the unique probability measure
invariant under multiplication by elements of the group). We can then get a random discrete connection in the “silly”
way by orienting each edge and assigning $g$ in one direction and $g^{-1}$ in the other one, and we multiply the matrices
along a given loop to get the “holonomy”. The continuum version of this then involves integrating around loops instead
of multiplying, and we need to assign an $N \times N$ matrix (random group element) to each of the $d$ directions for each
point in $\mathbb{R}^d$.


Fact 112

Notice that if we just care about the multiplication along loops, we can perform some operations that do not
change those values. For example, if we have all edges directed into a given vertex in our lattice, and we multiply
the matrices at those edges on the right by some fixed group element, then the contributions will cancel out along any
given loop. These are known as gauge transformations, and one way to visualize this is that we're choosing different ways
to map tangent spaces between different points on our surface (for example, associating the orientation at the
equator and the orientation at the North Pole).


Thus, it makes sense to talk about a “gauge equivalence class,” and it's important as we try to define a random
connection. When we have something like the Ising model, we usually weight configurations by a factor $e^{-\beta E}$ (where
$E$ is the configuration energy), and we might want to do something similar here. Since the trace of the product of
the elements along any given plaquette (small square in our lattice) is invariant under gauge transformations,
that's what we'll use to define our energy (remembering that this trace is the same no matter which corner of the square
we start at), and it's how we weight our probabilistic law. Our notion of “flatness” then has to do with having maximal
trace (for example if all edges are labeled with the identity matrix).
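The gauge invariance of the plaquette trace can be checked numerically on a single square. The sketch below is an illustration under assumptions that are not from the lecture: the group is $SU(2)$, represented by unit quaternions $(a, b, c, d)$ so that the trace of the corresponding $2 \times 2$ matrix is $2a$, and a gauge transformation acts on the edge from $x$ to $y$ by $g_{xy} \mapsto h_x g_{xy} h_y^{-1}$. All function names are hypothetical.

```python
import math
import random

def qmul(p, q):
    """Quaternion product, which matches matrix multiplication in SU(2)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qinv(q):
    """Inverse of a unit quaternion (its conjugate)."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def random_su2(rng):
    """Haar-distributed element of SU(2): a uniform point on the 3-sphere."""
    v = [rng.gauss(0, 1) for _ in range(4)]
    r = math.sqrt(sum(x * x for x in v))
    return tuple(x / r for x in v)

def plaquette_trace(g, corners):
    """Trace of the ordered product of edge variables around the loop `corners`."""
    prod = (1.0, 0.0, 0.0, 0.0)
    k = len(corners)
    for i in range(k):
        prod = qmul(prod, g[(corners[i], corners[(i + 1) % k])])
    return 2 * prod[0]

def gauge_transform(g, h):
    """Apply g_xy -> h_x g_xy h_y^{-1} to every oriented edge."""
    return {(x, y): qmul(qmul(h[x], m), qinv(h[y])) for (x, y), m in g.items()}

rng = random.Random(0)
g = {(i, (i + 1) % 4): random_su2(rng) for i in range(4)}   # edges of one square
h = {i: random_su2(rng) for i in range(4)}                  # one group element per vertex
before = plaquette_trace(g, [0, 1, 2, 3])
after = plaquette_trace(gauge_transform(g, h), [0, 1, 2, 3])
print(abs(before - after) < 1e-9)
```

The intermediate factors $h_y^{-1} h_y$ cancel around the loop, leaving a conjugation by $h_0$ that the trace cannot see; the same cyclic property explains why the starting corner does not matter.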


Remark 113. If our group is abelian, we can take the energies along each plaquette and obtain the energy for larger
loops by just combining the plaquettes. But in the general case knowing just the values on the small loops is not
enough.


The classical problems that we study involve small $N$: for example, in the Standard Model, we want to do gauge
theory with the gauge group $U(1) \times SU(2) \times SU(3)$ in the four-dimensional space $\mathbb{R}^4$, and elements of the gauge group
can be written as $6 \times 6$ matrices in block diagonal form.


Definition 114

The Wilson loop $W_\ell$ of a loop $\ell$ (on our discrete lattice) is the expectation of the trace of the product of matrices
along the loop.


The fundamental question that we care about is then how we can calculate these Wilson loops. Generally, if
$\ell_1, \ldots, \ell_n$ are a collection of loops in our $d$-dimensional lattice or space, we might want to compute the function

$$F(\ell_1, \ldots, \ell_n) = \langle W_{\ell_1} \cdots W_{\ell_n} \rangle.$$


This is a well-defined object at the discrete-level, since we have a concrete probability measure on the set of loops. 
But in physics, we want to consider the continuum theory, and we want to see if there is a continuum version of this 
expectation for arbitrary loops. And at the moment, we currently don’t know much — even simulations are difficult to 
perform except on very small grids. 

We'll now look at Sourav Chatterjee's paper on lattice gauge theory at https://arxiv.org/pdf/1502.07719.pdf,
which essentially gives a rigorous proof of the statement “if we want to understand the Wilson loop, we have to
sum over surfaces which have this loop as the boundary” (more precisely, we look at ways the string can be shrunk back
to the empty string). We consider equivalence classes of closed paths (where two paths are equivalent if they differ by a
cycling of the vertices), and we'll call a string a sequence of these path classes $(\ell_1, \ldots, \ell_n)$. We can then come up with
some string operations in $SO(N)$ (deforming, splitting, twisting, and merging) or $SU(N)$ (deforming, splitting, expansion,
and merging) and use them to discuss the concept of two strings being “adjacent.”

Now, recall that discrete harmonicity tells us that we can find the value at a point given the values at its neighbors
(because random walks evolve as martingales), so that the value at a given point is basically the average of the values
we get if we look at all paths that start there and hit the boundary (so assigning trajectories the value where we
stop). More generally, if we have a nonzero $\Delta h$, we can construct some other martingale and still sum over allowed
trajectories. So suppose we make a graph out of the string adjacency relations, and the Wilson loop for a string can be
calculated as a weighted sum over the adjacent strings' Wilson loops. The idea Chatterjee has is to use this to get
a function that satisfies the Makeenko-Migdal equations (which give us a variant of the Laplacian), and we'll get
a sum over trajectories all stopped at the empty string. (This is the gauge-string duality.) We won't work through
the logic in detail, but the fundamental measure we're considering is


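$$d\mu_{\Lambda,N,\beta}(Q) \propto \exp\Big(N\beta \sum_{p \in P_\Lambda^+} \mathrm{Tr}(Q_p)\Big) \prod_{e \in E_\Lambda^+} d\sigma_N(Q_e),$$

where $\sigma_N$ is the Haar measure on SU(N) or SO(N), $P_\Lambda^+$ is the set of positive plaquettes, and
$E_\Lambda^+$ is the set of corresponding edges.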


Remark 115. Let's take a closer look at the notion of “distance,” specifically how far away a unitary matrix is from the
identity matrix and why traces keep popping up. In the U(1) case, where we have a unit circle in the complex plane,
we can take either $|1 - z|^2$ or $2 - 2\mathrm{Re}(z)$ for a point $z$ with $|z| = 1$ (these are equal). If we instead
have a diagonal unitary $N \times N$ matrix $D$, we can consider either $2N - 2\mathrm{Re}(\mathrm{Tr}(D))$ or
$\mathrm{Tr}((I - D)(I - D^*))$, and the two again agree; this turns out to be true for general unitary matrices $A$.
Thus, when we look at the Yang-Mills plaquette trace (which gives us a notion of energy), it makes sense that we
want to charge a price depending on the amount of curvature on each plaquette, and that corresponds to how far the
product of matrices is from the identity, or equivalently how far its trace is from the trace of the identity matrix.
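As a quick numerical sanity check of the remark, the sketch below compares $\mathrm{Tr}((I - U)(I - U^*))$ with $2N - 2\mathrm{Re}(\mathrm{Tr}(U))$ for one random unitary matrix. The helper names and the QR-based sampling are illustrative assumptions, not something taken from the lecture.

```python
import numpy as np

def random_unitary(n, rng):
    """Sample an n x n unitary matrix via QR of a complex Gaussian matrix (illustrative choice)."""
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, r = np.linalg.qr(z)
    # Rescale each column by a unit phase; q stays unitary.
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases

def frobenius_distance_sq(u):
    """Tr((I - U)(I - U^*)): squared Frobenius distance from the identity."""
    d = np.eye(u.shape[0]) - u
    return np.trace(d @ d.conj().T).real

def trace_deficit(u):
    """2N - 2 Re Tr(U): how far the trace is from the trace of the identity, doubled."""
    return 2 * u.shape[0] - 2 * np.trace(u).real

rng = np.random.default_rng(0)
u = random_unitary(4, rng)
print(frobenius_distance_sq(u), trace_deficit(u))  # the two numbers should agree
```

The agreement only uses $UU^* = I$, which is why the same check works for any unitary (or orthogonal) matrix and not just diagonal ones.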


We'll dive into this more next time. Note that Chatterjee’s “string trajectory” setup makes sense except for a few
caveats, including some issues with divergent sums. So there’s some renormalization that we need to take care of,
and specifically we need to have $N = \infty$ and very small $\beta$ in our weighting. (Basically, as we take our mesh
finer and finer, we want the Wilson loop expectations around a large loop to stay about the same, meaning that each
plaquette product should be close to the identity, which requires large $\beta$. So the scaling limit can't be obtained
directly from Chatterjee’s argument.)
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We'll continue our discussion of Yang-Mills theory today — if we search up “Yang-Mills and mass gap” on Wikipedia, 
we'll see a description similar to ours but starting with the construction in the continuum setting directly. (We can 
consult Arthur Jaffe and Edward Witten’s paper, https://www.claymath.org/sites/default/files/yangmills.pdf, to see 
the original description of the key problem.) Recall that our setup involves a square grid where we place an element 
of a matrix Lie group at every edge, weighting configurations that are “closer to the identity” more heavily by looking
at traces of plaquettes. And we can get to the continuum case by placing a Lie algebra element at every point in
space, and then we end up with the Yang-Mills Lagrangian

$$L = \frac{1}{4g^2} \int \mathrm{Tr}(F \wedge *F),$$

where $F$ is the curvature of the gauge field, $g$ is the coupling constant, and $*$ is the “Hodge duality operator.”
(As a sidenote, Lie algebras basically come from the directions in a tangent space of a Lie group, so they're
“infinitesimal versions” of those rotation matrices, explaining why they come up in the continuum case. The Lie
algebra brackets then correspond to commutators of two infinitesimal rotations.)

In the paper linked above, the problem is stated on page 6: it is to prove that for any compact simple gauge
group G, there is a nontrivial quantum Yang-Mills theory on $\mathbb{R}^4$ with mass gap $\Delta > 0$, where existence
includes establishing properties at least as strong as a certain list of axioms. (This is perhaps the least precisely
stated Millennium Prize problem, because it’s a case where the problem includes figuring out how to formulate it, and
that itself is tricky!)

For now, though, we'll get back to the math of lattice Yang-Mills (assigning group elements to edges), where we
have a measure weighted by $\exp\big(N\beta \sum_{p \in P_\Lambda^+} \mathrm{Tr}(Q_p)\big)$, where $Q_p$ is the product
of the matrices around a plaquette $p$, and also usually weighted by the Haar measure (on SU(N) or SO(N)) on each edge.
But sometimes we might weight with respect to the Gaussian Unitary Ensemble (GUE) (meaning that the distribution is
invariant under unitary transformations, not that the sample is unitary) or Gaussian Orthogonal Ensemble (GOE) or
Ginibre ensemble, which basically all correspond to sampling entries of the matrix to be Gaussian in slightly
different ways.

Last time, we discussed gauge fixing and gauge equivalence. Recall that if we multiply all of the inward directed
edges into a point in our lattice Yang-Mills theory by some matrix element $e$ (equivalently, all of the outward
directed edges by its inverse), this keeps plaquette traces identical (because the plaquette products remain the same
up to conjugation). So we may want to simplify our picture by doing these “gauge transformations”: we can always make
one of the edges coming out from a vertex into an identity by multiplying by an appropriate $e$. In two dimensions, to
avoid this ambiguity, we can pick a spanning tree of our graph. In particular, we can pick a root and perform gauge
transformations along the tree until we have $I$s everywhere along the tree edges. Now we can choose matrices from the
Haar measure on the remaining edges and weight by plaquette traces.
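The gauge transformation described above can be sketched on a single plaquette. This is a minimal illustration, assuming SO(3)-like orthogonal matrices, a plaquette with vertices 0, 1, 2, 3, and edges `a`, `b`, `c`, `d` oriented around it; all names are hypothetical.

```python
import numpy as np

def random_orthogonal(n, rng):
    """Sample an n x n orthogonal matrix via QR of a real Gaussian matrix (illustrative choice)."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def plaquette_product(a, b, c, d):
    """Product of edge matrices around the plaquette 0 -> 1 -> 2 -> 3 -> 0."""
    return a @ b @ c @ d

def gauge_transform_at_vertex_1(a, b, c, d, g):
    """Multiply the edge into vertex 1 (a) by g and the edge out of it (b) by g^{-1} = g^T."""
    return a @ g, g.T @ b, c, d

rng = np.random.default_rng(1)
a, b, c, d = (random_orthogonal(3, rng) for _ in range(4))

# Gauge fixing: choosing g = a^{-1} turns the edge a into the identity matrix.
a2, b2, c2, d2 = gauge_transform_at_vertex_1(a, b, c, d, a.T)
print(np.trace(plaquette_product(a, b, c, d)))
print(np.trace(plaquette_product(a2, b2, c2, d2)))  # same plaquette trace
```

Repeating this along a spanning tree, one vertex at a time, is what sets every tree edge to the identity while leaving all plaquette traces unchanged.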


Fact 116 


If we now look at our remaining non-tree edges and number them $a_1, a_2, \dots$ by adjacency, we have situations
where our plaquettes have products looking like $a_j a_{j+1}^{-1}$. This gives us a weighting of the form
$\prod_j e^{\beta \mathrm{Tr}(a_j a_{j+1}^{-1})}$, and if we choose our spanning tree so that all terms look exactly
like this (meaning we can explore one plaquette at a time), this is a Markov chain. So we can think of each term here
as a jump in a random walk on the group, and that gives us a “Brownian motion on a Lie group” (because we're composing
independent increments).
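To see the random-walk picture in the simplest possible case, here is a sketch that swaps in the abelian group U(1) for SU(N) or SO(N) (this substitution is an assumption made purely for illustration). Each plaquette then contributes an independent angle with density proportional to $e^{\beta \cos\theta}$, and the expected Wilson loop around a region of `area` plaquettes is the `area`-th power of the mean increment.

```python
import numpy as np

def mean_increment(beta, n_grid=4096):
    """E[cos(theta)] when theta has density proportional to exp(beta * cos(theta))."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    w = np.exp(beta * np.cos(theta))
    return float(np.sum(np.cos(theta) * w) / np.sum(w))

def wilson_loop_exact(beta, area):
    """Independent increments multiply, so the expectation is a power of the mean."""
    return mean_increment(beta) ** area

def wilson_loop_monte_carlo(beta, area, n_samples, rng):
    """Simulate the walk on the circle: one independent angle per enclosed plaquette."""
    theta = rng.vonmises(0.0, beta, size=(n_samples, area))
    return float(np.mean(np.cos(theta.sum(axis=1))))

rng = np.random.default_rng(0)
for area in (1, 2, 4, 8):
    print(area, wilson_loop_exact(2.0, area), wilson_loop_monte_carlo(2.0, area, 100_000, rng))
```

Since the mean increment lies strictly between 0 and 1, the printed values shrink geometrically with the area, which is the exponential decay discussed in the text in this toy setting.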


So this is what mathematicians mean when they say that “two-dimensional Yang-Mills is solved”: we can get
the values of Wilson loops by “looking at Brownian motion” inside them, corresponding to calculating expectations
of traces of products of random matrices. Because we know that random walks converge (often exponentially) to a
stationary measure, and so does Brownian motion on a group, what we find is that the expected Wilson loop trace
decays exponentially in the area enclosed. But this kind of decay is hard to prove in higher dimensions, and
it has something to do with the quark confinement problem in physics. (And the mass gap connection to physics is
related to the exponential decay of correlations between Wilson loops that are spatially far apart.)


Remark 117. To understand how we actually define a law to calculate these correlations, notice that for a random
variable $X$ taking values in $[-1,1]$, knowing the moments $E[X^n]$ gives us $E[P(X)]$ for any polynomial $P$, which
gives us $E[f(X)]$ for any $f \in L^2$ by density. In our case, we are instead able to get expressions like
$E[\mathrm{Tr}(A)]$, $E[\mathrm{Tr}(A^2)]$, and so on ($A^2$ just corresponds to going around the loop $A$ twice),
which means we get expectations of the form $E\big[\sum_j \lambda_j^p\big]$, where the $\lambda_j$ are the eigenvalues
of $A$. Eventually this indeed gives us everything we want to know about the law of the $\lambda_j$s, so this does
allow us to make the claim that knowing expectations of the product of Wilson loops is all we need to describe the law.


Recall that Chatterjee’s argument does indeed talk about these kinds of calculations in the two-dimensional lattice,
but we can’t use it directly in the continuum setting because it requires us to choose the temperature so that areas
are penalized strongly (which is not what we want to do in the fine mesh limit). To make things work, we need a way to
assign values to certain divergent sums, and there are many different ways to do this (analytic continuation of a
family of functions, Cesàro summation, Borel summation) which are all essentially “cheating.”


Instead, we'll briefly showcase something that we are able to do — consider a problem like the following: 


Problem 118 


Let $A$ be a sample from the $N$-dimensional GUE. Compute $E[\mathrm{Tr}(A^4)\mathrm{Tr}(A^6)\mathrm{Tr}(A^8)]$.


The idea is that $\mathrm{Tr}(A^4)$ is a sum over terms $a_{ij}a_{jk}a_{k\ell}a_{\ell i}$ (for all $i,j,k,\ell$), and we
can represent this graphically as a directed cycle on a square with vertices $i,j,k,\ell$ (each taking on one of $N$
possible values). Similarly, we can represent $\mathrm{Tr}(A^6)$ as a directed cycle on a hexagon and
$\mathrm{Tr}(A^8)$ as a directed cycle on an octagon; we can then compute the product of traces to be the sum over all
triples of directed cycles. Because each $a_{ij}$ is a centered Gaussian, independent of the others (except for the
restriction relating $a_{ij}$ and $a_{ji}$), we can use Wick’s theorem, which expresses the expectation as a sum over
the different ways to match the edges into pairs. But because the product of two independent centered Gaussians has
expectation 0, we must take matchings of edges so that the labels line up: in other words, we have to glue edges
together so that the orientations are consistent. This actually gives us a surface, so computing expected traces is
indeed a sum over labeled surfaces (where the number of ways to label a given surface is just $N^j$, where $j$ is the
number of vertices after gluing). Euler's formula then tells us that $j$ is related to the genus $g$, and that explains
why Chatterjee’s story works!
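The gluing count for a single polygon can be carried out by brute force. The sketch below enumerates all pairings of the edges of a $2k$-gon, glues paired edges with reversed orientation, and counts the surviving vertices; it assumes the normalization $E[a_{ij}a_{k\ell}] = \delta_{i\ell}\delta_{jk}$ (so $E|a_{ij}|^2 = 1$), and the function names are hypothetical.

```python
from collections import Counter

def perfect_matchings(items):
    """Yield every way to split the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching

def glued_vertex_count(n_edges, matching):
    """Number of distinct vertices left after gluing paired edges of an n-gon.

    Edge i runs from vertex i to vertex i+1 (mod n_edges). Pairing edge i with
    edge j in the orientation-reversing way identifies vertex i with vertex
    j+1 and vertex i+1 with vertex j.
    """
    parent = list(range(n_edges))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in matching:
        parent[find(i)] = find((j + 1) % n_edges)
        parent[find((i + 1) % n_edges)] = find(j)
    return len({find(v) for v in range(n_edges)})

def wick_vertex_counts(k):
    """Counter {v: number of pairings of a 2k-gon whose glued surface has v vertices}."""
    n_edges = 2 * k
    return Counter(glued_vertex_count(n_edges, m)
                   for m in perfect_matchings(list(range(n_edges))))

def expected_trace(k, n):
    """E[Tr(A^{2k})] for an n x n GUE matrix under the normalization assumed above."""
    return sum(count * n**v for v, count in wick_vertex_counts(k).items())

print(wick_vertex_counts(2))  # square: two pairings give 3 vertices, one gives 1
print(expected_trace(2, 3))   # 2 * 3**3 + 3 = 57
```

With one face and $k$ glued edges, Euler's formula gives $j = k + 1 - 2g$ vertices, so the pairings with three vertices for the square are spheres and the one with a single vertex is a torus.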
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In our last lecture, we'll talk more about the big-picture connections between Yang-Mills, random surfaces, and random 
matrices. Throughout this class, we've already constructed many examples of random surfaces (LQG), curves (SLE), 
planar maps, and so on, but there's also the world of gauge theory and Yang-Mills that we've just started to explore,
which is supposed to be (in some way) at the heart of the Standard Model of physics.


Example 119 


We'll now make another connection motivated by physics: statistical mechanics often talks about random
collections of points, such as the molecules in a Coulomb gas.


If we want our potential function to be harmonic, and intuitively the flux over a sphere from a point source should
be independent of the radius (because no energy is lost), the natural Coulomb repulsion force should be proportional
to $\frac{1}{r^2}$ in three dimensions, or $\frac{1}{r}$ in two dimensions. So if we imagine a system of $N$ particles
that are restricted to a line or plane but experiencing a two-dimensional force (so that the energy difference between
two particles being at distance $R_1$ and at distance $R_2$ is $\log R_2 - \log R_1$), we can now imagine “randomly
placing” these $N$ repelling particles in a finite domain to get an ensemble. As the problem is stated, the particles
will just want to be far apart (which isn't very interesting), so we often instead consider an energy function of the
form


$$H = \sum_i x_i^2 + V,$$

where $x_i^2$ is the squared distance of particle $i$ from the origin (this is like a “background harmonic oscillator”)
and $V$ is the total potential energy coming from the Coulomb repulsion. (Then statistical mechanics assigns a
configuration of these locations a probability proportional to $e^{-\beta H}$.)

It turns out that the motivation for this kind of setup actually comes from random matrices just as much as it
comes from point particle systems. Let’s review some of the definitions:


Definition 120 


The Gaussian unitary ensemble (GUE) is the Gaussian measure on $n \times n$ Hermitian matrices $H$ with density
$\frac{1}{Z} e^{-\frac{n}{2}\mathrm{Tr}(H^2)}$ (with respect to the Lebesgue measure), where $Z$ is a normalizing factor.


(Recall that Hermitian matrices satisfy $H^\dagger = H$, and that the name GUE comes from this measure being
invariant under unitary transformations, not from the matrices being unitary.) The sum $\mathrm{Tr}(H^2)$ here is the
sum of the squared eigenvalues of $H$, but we can also write it out in terms of the matrix entries as
$\sum_{i,j} H_{ij}H_{ji} = \sum_{i,j} |H_{ij}|^2$, so in fact the GUE independently samples each entry in the upper
triangular part of $H$, requiring diagonal entries to be real and off-diagonal entries to be complex and satisfying
$H_{ij} = \overline{H_{ji}}$.
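A small sketch of this sampling procedure, together with a check of the trace identity, is below. The normalization $E|H_{ij}|^2 = 1$ used here is an assumption chosen for convenience; it differs from the density in the definition only by an overall rescaling of $H$.

```python
import numpy as np

def sample_gue(n, rng):
    """Hermitian matrix with independent Gaussian entries on and above the diagonal.

    Normalization (an assumption for this sketch): E|H_ij|^2 = 1 for all i, j,
    with real diagonal entries and complex off-diagonal entries.
    """
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    return (g + g.conj().T) / np.sqrt(2)

rng = np.random.default_rng(0)
h = sample_gue(5, rng)

trace_of_square = np.trace(h @ h).real              # Tr(H^2)
sum_abs_entries = np.sum(np.abs(h) ** 2)            # sum over i, j of |H_ij|^2
sum_sq_eigenvalues = np.sum(np.linalg.eigvalsh(h) ** 2)
print(trace_of_square, sum_abs_entries, sum_sq_eigenvalues)  # all three should agree
```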


We also have a similar definition for real matrices: 


Definition 121 
The Gaussian orthogonal ensemble (GOE) is the Gaussian measure on $n \times n$ symmetric matrices $H$ with density
$\frac{1}{Z} e^{-\frac{n}{4}\mathrm{Tr}(H^2)}$ (with respect to the Lebesgue measure), where $Z$ is a normalizing factor.


Remark 122. We can also make a similar definition of the Gaussian symplectic ensemble (GSE), which has density
proportional to $e^{-n\mathrm{Tr}(H^2)}$ for Hermitian quaternionic matrices. But all of these stories are similar.


One of the fundamental results in random matrix theory is the distribution of the eigenvalues, which has joint
probability density

$$\frac{1}{Z} \prod_{k=1}^n \exp\Big(-\frac{\beta}{4}\lambda_k^2\Big) \prod_{i<j} |\lambda_i - \lambda_j|^\beta,$$

where $\beta$ depends on which of the three ensembles we're using ($1, 2, 4$ for GOE, GUE, GSE respectively). This
second term can be rewritten as $\exp\big(\beta \sum_{i<j} \log|\lambda_i - \lambda_j|\big)$, and thus if we look at the
two terms together we see that the $\sum_k \lambda_k^2$ corresponds to our “background harmonic oscillator” from above,
while the $\beta \sum_{i<j} \log|\lambda_i - \lambda_j|$ tells us about the strength of the electric repulsion between
the particles: the eigenvalues look like they’re charged particles placed in a quadratic potential.
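The correspondence can be checked numerically: taking the energy $H(\lambda) = \frac{1}{4}\sum_k \lambda_k^2 - \sum_{i<j}\log|\lambda_i - \lambda_j|$, the unnormalized eigenvalue density is exactly $e^{-\beta H}$. The sketch below assumes the $\frac{\beta}{4}$ normalization of the Gaussian factor written above; the function names are hypothetical.

```python
import numpy as np

def log_eigenvalue_weight(lam, beta):
    """log of prod_k exp(-beta/4 * lam_k^2) * prod_{i<j} |lam_i - lam_j|^beta (unnormalized)."""
    lam = np.asarray(lam, dtype=float)
    i, j = np.triu_indices(len(lam), k=1)
    return float(-beta / 4 * np.sum(lam**2)
                 + beta * np.sum(np.log(np.abs(lam[i] - lam[j]))))

def coulomb_energy(lam):
    """Quadratic confinement plus two-dimensional (logarithmic) Coulomb repulsion."""
    lam = np.asarray(lam, dtype=float)
    i, j = np.triu_indices(len(lam), k=1)
    return float(np.sum(lam**2) / 4 - np.sum(np.log(np.abs(lam[i] - lam[j]))))

lam = [-1.3, -0.2, 0.5, 1.9]
for beta in (1, 2, 4):  # GOE, GUE, GSE
    print(beta, log_eigenvalue_weight(lam, beta), -beta * coulomb_energy(lam))
```

Configurations with two nearly equal eigenvalues have very large energy, which is the eigenvalue repulsion seen in all three ensembles.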

We can take this further as well: we might imagine putting in different potentials instead of $\sum_i x_i^2$, and we
might want to ask questions like the expected location of a given particle under that potential. Adding such an extra
potential corresponds to trying to compute something like $E[\mathrm{Tr}(A^4 + A^6)]$ instead of
$E[\mathrm{Tr}(A^2)]$, and recall that we talked about how to connect expectations of matrix powers to gluing together
random surfaces last lecture. Basically, the point is that all of these different perspectives motivated each other!


Fact 123 


One good key takeaway at this point is the fact that if $a, b$ are independent standard Gaussians, $E[(a + bi)^2] = 0$.
So when we do the Wick’s theorem calculation “pairing up our edges” in the directed cycle for GUE, something
like $E[A_{12}A_{12}]$ will be zero, but something like $E[A_{12}A_{21}]$ will be nonzero (because
$A_{12}A_{21} = |A_{12}|^2$ becomes real). Thus, this dictates requirements on orientation of our surface gluing (GUE
requires orientability, while GOE does not).
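A quick Monte Carlo illustration of this fact is below (the sample size and seed are arbitrary choices). Algebraically, $E[(a+bi)^2] = E[a^2] - E[b^2] + 2i\,E[a]E[b] = 1 - 1 + 0$, while $E[(a+bi)(a-bi)] = E[a^2] + E[b^2] = 2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 200_000

# z plays the role of the off-diagonal entry A_12, and conj(z) the role of A_21.
z = rng.standard_normal(n_samples) + 1j * rng.standard_normal(n_samples)

mean_same_orientation = np.mean(z * z)                    # analogue of E[A_12 A_12]
mean_opposite_orientation = np.mean(z * np.conj(z)).real  # analogue of E[A_12 A_21]

print(abs(mean_same_orientation))   # close to 0
print(mean_opposite_orientation)    # close to 2
```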
Coming back to thinking of $\mathrm{Tr}(A^4 + A^6)$ as an energy function, we often want to compute expressions in
the partition function like (letting $P(\lambda_i)$ be a polynomial in $\lambda_i$, so that
$\mathrm{Tr}(P(A)) = \sum_i P(\lambda_i)$)

$$E\big[e^{\mathrm{Tr}(P(A))}\big] = \sum_{k=0}^\infty \frac{E\big[\mathrm{Tr}(P(A))^k\big]}{k!}.$$


The term $\mathrm{Tr}(P(A))^k$ then involves grabbing $k$ of our directed cycles. For example, if
$P(\lambda) = 2\lambda^4 + \lambda^6 + \lambda^8$, then the $\mathrm{Tr}(P(A))^3$ term picks three polygons and is more
likely to pick squares than hexagons or octagons, and then we make random surfaces out of those. (The $k!$ then
basically corresponds to the fact that we can order our polygons in any of the $k!$ different ways.) Since $k$, the
number of faces, can be arbitrary, computing that partition function $E[e^{\mathrm{Tr}(P(A))}]$ does indeed correspond
to summing over all possible surfaces.


But what we're saying is too good to be true, because of convergence issues! If we have a Gaussian $X$ and want
to compute something like $E[e^{X^4}]$, we need to do the integral
$\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} e^{x^4} \, dx$, which diverges because the integrand blows up
as $x \to \infty$ (and this happens whenever our degree is higher than 2, which only allows us to make a disappointing
set of polygons). We could instead try to compute something like $E[e^{-\mathrm{Tr}(P(A))}]$, where $P$ has all
positive coefficients, and that expectation does exist. From the power series perspective, the “even-face” and
“odd-face” surface divergences cancel out, and we want the $\infty - \infty$ to be meaningful in some way.
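The one-dimensional version of this convergence issue is easy to see numerically. The sketch below truncates the Gaussian integral to $[-L, L]$ and compares the two signs in the exponent; the quartic $x^4$, the cutoffs, and the grid size are illustrative assumptions.

```python
import numpy as np

def truncated_gaussian_expectation(sign, cutoff, n_grid=200_001):
    """Trapezoid-rule estimate of the integral over [-cutoff, cutoff] of
    (2*pi)^(-1/2) * exp(-x^2/2 + sign * x^4) dx."""
    x = np.linspace(-cutoff, cutoff, n_grid)
    y = np.exp(-x**2 / 2 + sign * x**4) / np.sqrt(2 * np.pi)
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(x)))

for cutoff in (2.0, 3.0, 4.0):
    plus = truncated_gaussian_expectation(+1, cutoff)    # truncation of E[exp(X^4)]
    minus = truncated_gaussian_expectation(-1, cutoff)   # truncation of E[exp(-X^4)]
    print(cutoff, plus, minus)
```

The positive-sign column grows without bound as the cutoff increases, while the negative-sign column stabilizes almost immediately, matching the claim that only the $e^{-P}$ expectation exists.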

One way that this has been attempted is by making $N$ very large, so that we can ignore the ways of gluing surfaces
together which don’t have the maximal number of vertices (so for example, spheres are prioritized over tori because
of how the vertices are glued together from a square). So this gives us a constraint on the genus of the surface: we
only need to worry about the simply-connected (genus zero) surfaces in this limit. There’s more work that needs to be
done here, but this type of asymptotic analysis does allow us to prove some connections; this is known as the 't Hooft
expansion. (For $N = 2, 3, 4$, the kinds of numbers that come up in the Standard Model, this doesn't do very much
for us. But it does help us out when we have a statistical mechanics model of a gas of molecules, where we do want
$N$ to go to infinity.)


Finally, we'll think about how this story generalizes if we have more than one matrix: 


Example 124 
Suppose we go back to the GUE variant of lattice Yang-Mills and place a random matrix on each of the edges of
our lattice, and suppose we want the expected trace from the product of matrices around some given plaquettes.
This corresponds to wanting to calculate the expectation of something like $\mathrm{Tr}(ABCD)\mathrm{Tr}(ABADCA)$
instead of $\mathrm{Tr}(A^4)\mathrm{Tr}(A^6)$, where $A, B, C, D$ are the matrices around a plaquette.


We can represent this by again expanding the trace expressions and drawing polygon directed cycles to represent
terms like $E[A_{12}B_{24}C_{42}D_{21}]$, but to account for the different matrices we now color the edges and require
our gluing of surfaces to preserve both vertex label and color. So if we want to weight our lattice Yang-Mills theory
by some factor involving the plaquette traces, we end up with a sum over surfaces that we can make with plaquettes.
Calculating the expectation of a product of Wilson loops then comes down to finding ways to make (sum over)
random surfaces which have those Wilson loops as our boundary, and again we have to worry about convergence
for this (we often do this by weighting with polynomials instead, so that the issue with $e^{\mathrm{Tr}(H^n)}$ does
not appear). It turns out that we end up seeing connections with the Gaussian free field as well, because we're
choosing surfaces with weights that involve the determinant of the Laplacian. But solving the full story about scaling
limits of random surfaces is (of course) still open!
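The colored gluing rule can be sketched for a single trace of a word in several matrices. This is a minimal illustration, assuming the letters stand for independent GUE matrices with $E|a_{ij}|^2 = 1$; the helper names are hypothetical, and only pairings that match equal letters are kept.

```python
from collections import Counter

def perfect_matchings(items):
    """Yield every way to split the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching

def glued_vertex_count(n_edges, matching):
    """Vertices left after gluing paired edges of an n-gon with reversed orientation."""
    parent = list(range(n_edges))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in matching:
        parent[find(i)] = find((j + 1) % n_edges)
        parent[find((i + 1) % n_edges)] = find(j)
    return len({find(v) for v in range(n_edges)})

def colored_wick_counts(word):
    """Counter {v: count} such that E[Tr(word)] = sum of count * N**v.

    `word` is a string like "AABB"; edge colors are its letters, and a pairing
    contributes only if every pair joins two edges of the same color.
    """
    n_edges = len(word)
    counts = Counter()
    for matching in perfect_matchings(list(range(n_edges))):
        if all(word[i] == word[j] for i, j in matching):
            counts[glued_vertex_count(n_edges, matching)] += 1
    return counts

print(colored_wick_counts("AABB"))  # one planar gluing: N^3
print(colored_wick_counts("ABAB"))  # one non-planar gluing: N
print(colored_wick_counts("ABCD"))  # no color-preserving pairing: expectation 0
```

A word such as ABCD, in which every matrix appears once, has no admissible gluing on its own, which is why products of several traces (several polygons glued to each other) are the interesting objects in this setting.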
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